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Abstract 

A  set  of  fast  and  robust  electronic  video  stabilization  algorithms  are 
presented  in  this  thesis.  The  first  algorithm  is  based  on  a  two-dimensional 
feature-based  motion  estimation  technique.  The  method  tracks  a  small  set  of 
features  and  estimates  the  movement  of  the  camera  between  consecutive 
frames.  An  affine  motion  model  is  utilized  to  determine  the  parameters  of 
translation  and  rotation  between  images.  The  determined  affine 
transformation  is  then  exploited  to  compensate  for  the  abrupt  temporal 
discontinuities  of  input  image  sequences.  Also,  a  Frequency  domain 
approach  is  developed  to  estimate  translations  between  two  consecutive 
frames  in  a  video  sequence.  Finally,  a  jitter  detection  technique  has  been 
developed  to  isolate  vibration  affected  subsequences  from  an  image 
sequence.  The  experimental  results  of  using  both  simulated  and  real  images 
have  revealed  the  applicability  of  the  proposed  techniques.  In  particular,  the 
emphasis  has  been  to  develop  real  time  implementable  algorithms,  suitable 
for  unmanned  vehicles  with  severe  payload  constraints. 
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Fast  Video  Stabilization  Algorithms 


I.  Introduction 


1.1  Problem  Statement 

Assume  a  camera  rigidly  mounted  on  a  vehicle  in  motion.  If  the 
motion  of  the  vehicle  is  smooth,  so  will  be  the  corresponding  image 
sequence  taken  from  the  camera.  In  the  case  of  small  unmanned  aerial 
imaging  system,  and  off  road  navigating  ground  vehicles,  the  onboard 
cameras  experience  sever  jitter  and  vibration.  Consequently,  the  video 
images  acquired  from  these  platforms  have  to  be  preprocessed  to  eliminate 
the  jitter  induced  variations  before  human  analysis.  The  task  at  hand  is  to 
detect  the  jitter  and  eliminate  its  effect.  It  is  composed  of  two  subtasks:  First, 
to  develop  a  reliable  method  to  detect  in  real-time  the  subsequence  affected 
by  jitters.  Second,  to  develop  a  strategy  to  interpolate  the  images,  without 
sacrificing  detail  (dismount  targets). 

1.2  Research  Goal 

Motion  in  video  images  is  caused  by  either  the  object  motion  or  the 
camera  movement.  Digital  (electronic)  image  stabilization  (DIS/EIS)  system 
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endeavor  to  produce  a  compensated  video  sequence  so  that  image  motion 
due  to  the  camera’s  undesirable  vibration  or  juggles  can  be  removed  [1]. 
The  goal  of  this  research  is  to  introduce  a  new  approach  to  stabilize  image 
sequence.  The  newly  developed  algorithm  provides  a  fast  and  robust 
stabilization  system,  and  alters  real-time  performance. 

1.3  Applications  of  DIS 

Modem  (contemporary)  light  weight  digital  camera,  camcorders, 
CCD  sensing  arrays,  and  next-generation  mobile  phone  with  visual  display, 
etc.,  are  principal  candidates  in  need  of  automatic  image  stabilization.  They 
are  prone  to  inevitable  and  undesirable  camera  motion  during  the  image 
capturing  process.  It  would  be  worthwhile  to  have  a  digital  image  sequence 
stabilization  scheme  that  can  further  stabilize  the  image  sequence  for 
improving  the  subjective  quality  of  the  video  sequence  obtained.  Moreover, 
an  image  stabilization  algorithm  is  reported  to  be  beneficial  to  the  coding 
efficiency  of  video  signals  [6].  It  also  has  been  used  for  the  computation  of 
egomotion  [17,  18],  detection  and  tracking  of  Independently  Moving  Objects 
(IMOs)  [20,  21,  22],  and  video  compression  [19]. 

The  developed  algorithm  is  being  implemented  for  Unmanned  Arial 
Vehicle  (UAV)  surveillance  applications. 
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1.4  Organization  of  the  Study 


This  thesis  is  organized  into  five  chapters.  The  first  chapter  presents 
the  introduction,  problem  statement,  the  goal  of  the  research,  and  finally  it 
summarizes  the  three  types  of  image  stabilization  methods.  Chapter  two 
reviews  some  related  previous  works.  Chapter  three  presents  fundamental 
concepts  in  the  field  of  image  processing  which  are  necessary  to  understand 
the  methodology  used  to  solve  the  problem  being  studied.  Chapter  four 
explains  the  methodologies  and  the  techniques  used  to  implement  the 
various  algorithms.  Chapter  five  documents  the  data  resulted  from  the 
algorithms  test.  Chapter  six  summarizes  the  research,  including  limitations 
and  areas  of  future  work. 

1.5  Image  Stabilization  Methods 

There  are  three  types  of  image  stabilizers  currently  available  [23]: 
Digital  Image  Stabilization  (DIS),  Optical  Image  Stabilization  (OIS),  and 
Mechanical  Image  Stabilization  (MIS). 

1. 5. 1  Digital  Image  Stabilization 

Digital  Image  Stabilization  (DIS)  systems  use  electronic  processing  to 
control  image  stability.  The  DIS  system  starts  working  once  the  image  hits 
the  light-sensing  chip,  the  Charge  Coupled  Device  (CCD).  If,  through  its 
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sensors,  the  system  detects  what  it  thinks  is  camera  vibration,  it  responds  by 
slightly  moving  the  image  so  that  it  remains  in  the  same  place  on  the  CCD. 
For  example,  if  the  camera  vibrations  to  the  right,  the  image  moves  to  the 
left  to  compensate,  thus  eliminating  the  vibration  [23]. 

There  are  two  ways  DIS  works  to  reduce  the  perceived  movement  of 
the  image.  One  method  increases  the  size  of  the  image  by  digitally 
"Zooming"  in  on  the  image  so  that  it  is  larger  than  the  CCD.  By  making  the 
image  larger,  the  system  can  "pan  and  scan"  within  the  image  to  counter  the 
movement  created  by  the  vibration.  Because  this  system  must  digitally  zoom 
in  on  the  image  to  slightly  increase  its  size,  it  decreases  the  picture 
resolution  somewhat.  The  other  method  of  electronic  stabilization  uses  an 
oversized  CCD.  The  video  image  covers  only  about  90  percent  of  the  chip's 
area,  giving  the  system  space  in  which  to  move  the  image.  When  the  image 
is  stable,  the  chip  centers  the  image  on  the  CCD.  If  the  camera  vibrates  to 
the  right,  the  image  has  the  space  to  roam  to  the  left  to  compensate  for  the 
vibration,  keeping  the  subject  of  the  image  in  exactly  the  same  place  on  the 
CCD,  thus  eliminating  the  vibration. 

Detecting  the  vibration  is  key  to  the  effectiveness  of  the  system.  DIS 

systems  use  one  of  two  ways  to  detect  shaky  video.  Either  they  detect 

movement  within  the  image  as  recorded  on  the  CCD  or  they  detect  the  actual 

movement  of  the  camera.  The  first  method  of  detection  analyzes  the  changes 
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between  the  fields  in  each  image.  A  specially  designed  feature  of  the  camera 
stores  the  odd  and  even  fields  of  the  video  frame  and  look  for  changes 
between  them.  If  parts  of  the  image  change  in  one  field  but  not  the  other,  it 
indicates  that  the  subject  in  the  field  of  view  is  moving  but  not  the 
background.  If  however,  the  entire  image  changes  from  one  field  to  the  next, 
it  most  likely  means  there  is  camera  vibration  and  the  camera  must  correct 
the  image.  To  correct  the  camera  vibration,  the  camera's  electronics  detect 
the  direction  of  the  movement  and  shifts  the  active  field  so  that  it  meshes 
with  the  memorized  field.  A  major  disadvantage  of  this  system  is  that  if 
there  is  a  large  object  moving  in  the  frame,  it  may  be  interpreted  as  camera 
vibration  and  the  camera  will  attempt  to  stabilize  the  subject  causing  a 
blurring  of  the  image  and  reduction  in  picture  resolution.  The  camera  can 
also  use  motion  sensors  to  detect  camera  vibration.  Because  this  method 
senses  movement  in  the  camera  not  the  image,  the  movement  of  a  subject  in 
the  image  cannot  fool  it.  However,  it  will  sometimes  react  at  the  beginning 
of  an  intentional  camera  movement  (such  as  a  pan)  and  will  take  a  short 
moment  to  realize  that  you  are  moving  the  camera  on  purpose.  Instead  of  a 
smooth  pan,  the  image  will  freeze  and  then  leap  into  the  pan  suddenly  [23]. 
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1. 5. 2  Optical  Image  Stabilization 


The  Optical  Image  Stabilization  (OIS)  system,  unlike  the  DIS  system, 
manipulates  the  image  before  it  gets  to  the  CCD.  When  the  lens  moves,  the 
light  rays  from  the  subject  are  bent  relative  to  the  optical  axis,  resulting  in  an 
unsteady  image  because  the  light  rays  are  deflected.  By  shifting  the  IS  lens 
group  on  a  plane  perpendicular  to  the  optical  axis  to  counter  the  degree  of 
image  vibration,  the  light  rays  reaching  the  image  plane  can  be  steadied  [15]. 

Since  image  vibration  occurs  in  both  horizontal  and  vertical 
directions,  two  vibration-detecting  sensors  for  yaw  and  pitch  are  used  to 
detect  the  angle  and  speed  of  movement.  Then  the  actuator  moves  the  IS 
lens  group  horizontally  and  vertically  thus  counteracting  the  image  vibration 
and  maintaining  the  stable  picture.  The  Shift-IS  component  is  located  within 
the  lens  groups  and  is  most  effective  for  lower  frequency  movements  caused 
by  platform  vibration  or  wind  effect  without  increasing  the  overall  size  and 
weight  of  the  master  lens.  Figure  1-1  shows  an  illustration  of  this  type  of 
image  stabilization. 
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Figure  1-1:  Optical  Image  Stabilization  [15] 


1. 5. 3  Mechanical  Image  Stabilization 

Mechanical  image  stabilization  involves  stabilizing  the  entire  camera, 
not  just  the  image.  This  type  of  stabilization  uses  a  device  called  “Gyros”. 
Gyros  consist  of  a  gyroscope  with  two  perpendicular  spinning  wheels  and  a 
battery  pack.  Gyroscopes  are  motion  sensors.  When  the  gyroscopes  sense 
movement,  a  signal  is  sent  to  the  motors  to  move  the  wheels  to  maintain 
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stability.  The  gyro  attaches  to  the  camera’s  tripod  socket  and  acts  like  an 


"invisible  tripod"  [13]. 


Figure  1-2:  Gyroscopic  Stabilizer  [13] 

Figure  1-2  shows  a  picture  of  a  gyroscopic  stabilizer.  The  vibration 
gyro  was  improved  by  employing  a  tuning  fork  structure  and  a  vibration 
amplitude  feedback  control  [33].  They  are  heavy,  consume  more  power,  and 
are  not  suitable  for  energy  sensitive  and  payload  constrained  imaging 
applications. 
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II.  Literature  Review 

Many  methods  for  video  stabilization  have  been  reported  over  the  past 
few  years.  Most  proposed  methods  compensate  for  all  motion  [2,  18,  20,  24, 
25,  26],  producing  a  sequence  where  the  background  remains  motionless. 
Other  techniques  only  subtract  the  3D  rotation  of  the  camera  [27,  28,  29] 
generating  a  de-rotated  sequence.  However,  these  methods  can  be 
distinguished  by  the  models  adopted  to  estimate  the  camera  motion  [9]. 
Several  two-dimensional  and  three-dimensional  stabilization  schemes  are 
described  in  [24].  For  2D  models,  in  general  all  the  estimated  affine  motion 
parameters  are  compensated  for,  i.e.,  gross  motion  is  removed  from  the  input 
sequence  [20,  25,  and  21].  Stabilization  in  3D  is  achieved  by  re-rotating  the 
frames,  generating  a  translation-only  sequence,  or  a  sequence  containing 
translation  and  low-frequency  rotation.  Yao  et  al.  [29]  Compensate  for  3D 
rotation  by  tracking  multiple  visual  cues,  like  distant  points  and  horizon 
lines,  using  an  extended  Kalman  filter  for  the  estimation  of  the  3D  motion 
parameters  of  interest.  Both  kinematics  and  kinetic  models  suitable  for 
determining  the  smooth  and  oscillatory  rotational  motion  components  are 
considered,  so  that  smoothed  rotation  can  be  also  obtained.  A  vehicle  model 
is  also  used  in  [27]  to  filter  the  high-frequency  components  of  the  rotational 
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parameters.  A  flow-based  motion  estimator  applied  to  points  on  the  horizon 
(distant  points)  is  used  to  estimate  the  rotational  parameters,  and  the  solution 
is  recursively  refined  to  obtain  smoothed  motion.  Two-dimensional  models 
are  used  by  [17,  18,  19,  and  22].  Another  method  in  [19]  seeks  to  use  linear 
segments  from  the  input  images  and  align  them  with  the  absolute  vertical 
direction,  which  can  be  provided  by  an  inertial  sensor,  eliminating  the  need 
to  estimate  the  rotation  around  the  optical  axis.  Stabilization  is  achieved  by 
compensating  for  2D  linear  translation,  which  minimizes  the  disparity 
between  two  successive  frames. 

Fast  implementations  of  2D  stabilization  algorithms  are  presented  in 
[25,  20,  and  21].  Hansen  et  al.  [25]  describe  the  implementation  of  an  image 
stabilization  system  based  on  a  mosaic-based  registration  technique.  Burt  et 
al.  [20]  describe  a  system  which  uses  a  multi-resolution,  iterative  process 
that  estimates  affine  motion  parameters  between  levels  of  Laplacian  pyramid 
images.  From  coarse  to  fine  levels,  the  optical  flow  of  local  patches  of  the 
image  is  computed  using  a  cross-correlation  scheme.  The  motion  parameters 
are  then  computed  by  fitting  an  affine  motion  model  to  the  flow  [9]. 

Some  studies  follow  frequency  domain  algorithms  to  estimate  motion 

between  two  images  [30,  31,  and  32].  The  Fourier  transform  properties  of 

relocated  images  are  used  to  estimate  rotation  and  translation.  Frequency 

domain  methods  for  estimating  shifts  in  the  image  plane  are  based  on  the 
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fact  that  a  shift  in  spatial  domain  can  be  expressed  as  a  phase  shift  in 
frequency  domain.  Two  shifted  images  differ  only  by  a  linear  phase 
difference  [30,  31].  These  methods  can  be  extended  to  include  (planar) 
rotation  and  scale  using  polar  coordinates  [32]  with  the  advantage  that  shift, 
rotation  and  scale  can  be  estimated  separately.  The  main  limitation  of 
frequency  domain  methods  is  that  they  are  restricted  to  global  shifts  and 
rotations  in  the  image  plane,  and  scale  [11].  If  the  scene  is  composed  of 
multiple,  independently  moving  objects,  then,  the  method  will  not  provide 
adequate  performance. 

A  fast  and  robust  implementation  of  a  digital  image  stabilization 
algorithm  presented  in  this  thesis  is  based  on  the  2D  model  described  in  [1]. 

The  developed  algorithm  is  similar  to  the  other  algorithms  based  on 
the  2D  rigid  motion  model  [29].  But  instead  of  using  extensive  feature¬ 
tracking,  our  parametric  motion  model  is  obtained  by  tracking  only  a  small 
set  of  features  to  characterize  the  underlying  motion  vectors  and  produce 
equally  good  performance. 

The  algorithm  is  applied  to  translational  and  rotational  camera  motion 
separately. 
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III.  Background 


This  chapter  presents  basic  ideas  behind  image  stabilization,  and 
introduce  various  analytical  tools  used  in  literature  for  building  a  simple 
vibration  compensation  systems.  In  particular,  we  investigated  the  problem 
using  three  approaches:  (1)  levelsets  based  shape  analysis,  (2)  feature  points 
based  jitter  detection  and,  (3)  Fourier  transform  based  approach. 

3.1  Image  Sampling 

Before  an  image  can  be  manipulated  using  various  image  processing 
techniques,  it  must  be  spatially  sampled.  The  process  of  sampling  an  image 
is  the  process  of  applying  a  two-dimensional  grid  to  a  spatially  continuous 
image  to  discretize  it  into  a  two-dimensional  array  of  elements. 

Figure  3-1  shows  a  sampled  image  containing  a  total  of  NM  sampled 
elements  using  a  rectangular  grid.  Any  type  of  sampling  grid  can  be  used, 
but  the  rectangular  grid  is  by  far  the  most  common  because  of  its 
relationship  to  two-dimensional  arrays.  The  fundamental  unit  of  a  sampled 
image  is  a  picture  element  and  is  typically  referred  to  as  a  pixel.  The  value 
of  each  pixel  is  equal  to  the  average  intensity  of  the  continuous  spatial  image 
covered  by  that  pixel. 
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X  Direction 


N  x  M  array  of  intensity  values 


Figure  3-1:  Spatially  sampled  image  containing  Nx  M  picture  elements 

The  result  of  sampling  produces  a  two-dimensional  array  of  numbers 
that  are  directly  proportional  to  the  intensity  levels  of  the  continuous  spatial 
image.  Real-time  video  data  is  usually  digitized  over  a  320x240,  640x480, 
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768x525,  or  1600x1200  grid  according  to  the  context.  Many  of  these  size- 
resolution  combinations  were  chosen  to  be  compatible  with  the  spatial  size 
of  NTSC  video  and  to  meet  the  storage  size  requirements  of  digital  memory. 
Image  size  that  are  powers  of  two  exist  because  of  the  requirements  for 
computing  the  Fast  Fourier  Transform  (FFT),  to  be  considered  later. 


3.2  Quantization 

Besides  spatial  sampling,  the  intensity  level  at  each  pixel  must  also  be 
digitized  into  a  finite  set  of  numbers.  The  process  of  digitization  converts  an 
analog  intensity  value  into  a  set  of  digital  numbers  that  represent  the 
intensity  levels  in  the  image.  The  quantity  of  numbers  used  to  represent  the 
intensities  in  a  continuous  tone  image  determines  the  final  quality  of  the 
digitization  process.  This  set  of  numbers  is  referred  as  the  gray  levels  or 
grayscales  of  an  image. 

Since  an  image  is  the  spatial  distribution  of  light  energy,  the  numbers 
assigned  to  gray  levels  of  a  digitized  image  can  take  only  positive  values. 
Figure  3-2  (a)  gives  a  4x4  sub-image  taken  from  an  image.  Figure  3-2  (b) 
gives  the  corresponding  grayscale,  with  the  value  of  0  assigned  to  black  and 
each  grayscale  value  increasing  in  intensity  until  the  value  of  255  is  reached, 
corresponding  to  white. 
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Figure  3-2:  An  example  of  (a)  a  sampled  and  digitized  4x4  sub-image  and  (b)  its 

corresponding  grayscale 

3.3  Converting  Grayscale  Images  to  Binary  Image  Using 
Thresholding 

Thresholding  is  an  image  processing  technique  for  converting  a 

grayscale  or  color  image  to  a  binary  image  based  upon  a  thresholding  value. 

If  a  pixel  in  the  image  has  an  intensity  value  less  than  the  threshold  value  k 

(i .Q.,f(x,y)<k),  the  corresponding  pixel  in  the  in  the  resulting  image  is  set  to 

0  (black).  Otherwise,  if  the  pixel  intensity  value  is  greater  or  equal  to  the 
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threshold  intensity  k  (i .Q.,f(x,y)>k),  the  resulting  pixel  is  set  to  255  (white). 
Thus,  it  is  used  to  create  a  binary  image,  or  an  image  with  only  2  colors, 
black  (0)  and  white  (255).  This  can  be  formulated  as  follows: 


f(x,y) 


f(x,y)  <  k 
f(x,y)>k 


The  last  equation  can  be  generalized  as  follows: 


f(x,y) 


\Ga  f(x,y)<k 
1  Gb  f(x,y)  >  k 


(3-1) 


(3-2) 


where,  Ga  and  Gh  are  the  desired  two  gray  levels  in  the  threshold  image. 

The  process  of  thresholding  as  described  by  equation  8  reduces  a 
multilevel  image  to  a  two  gray-level  image  containing  gray  levels  Ga  and  Gb. 
Equation  (  3-2)  can  be  expanded  to  include  more  than  one  threshold  value  as 
follows: 


f(x,y)  =  < 


Ga 

®-f(x,y)<kx 

Gb 

f(x,y)<k2 

Gc 

k2  <  f(x,y )  <  G1 

(3-3) 


where,  Gmax  is  the  maximum  allowable  gray  level  of  the  image  f(x,y)  (255  in 
case  of  8-bit  gray-scaled  image).  And  ki,  and  k2  are  threshold  values. 
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3.4  Histogram 


The  brightness  characteristic  of  an  image  can  be  concisely  described 
with  a  tool  known  as  the  brightness  histogram.  The  brightness  histogram 
describes  the  frequency  distribution  of  the  gray  levels  of  pixels  within  a 
digital  image.  It  provides  a  graphical  representation  of  how  many  pixels 
within  an  image  fall  into  a  given  image. 

A  histogram  appears  as  a  graph  with  “brightness”  on  the  horizontal 
axis  from  0  to  255  (for  an  8-bit  gray  scale)  and  “number  of  pixels”  on  the 
vertical  axis.  To  find  the  number  of  pixels  having  a  particular  brightness 
within  an  image,  we  simply  look  up  the  brightness  on  the  horizontal  axis, 
follow  the  bar  graph  up,  and  read  off  the  number  of  pixels  on  the  vertical 
axis.  Because  all  pixels  must  have  some  brightness  value  defining  them,  the 
number  of  pixels  in  each  brightness  column  adds  up  to  the  total  number  of 
pixels  in  the  image. 

Let’s  assume  that  an  image  has  been  digitized  and  sampled  into  N 
pixels,  each  of  which  has  been  quantized  into  n  levels  in  the  range  do,di , 
d„.j.  Figure  3-3  shows  the  histogram  of  this  image. 
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Number  of  pixels 


Figure  3-3:  Image  histogram 

The  function  h(dk)=  The  number  of  pixels  with  a  gray  level  equals  dk 
and  is  written  as  : 

h(dJ=Nk  <3-4> 

where,  dk  is  the  gray  level  and  Nk  is  the  number  of  pixels  with  a  gray 
level  =  dk. 

3.5  Cumulative  Histogram 

The  cumulative  histogram  is  another  variation  of  the  histogram  in 
which  the  vertical  axis  gives  not  just  the  number  of  the  pixels  at  that  gray 
level,  but  rather  gives  the  number  of  the  pixels  at  that  level  plus  the  number 
of  pixels  with  smaller  values  of  gray  level. 

Using  the  same  assumptions  as  in  the  last  section,  the  cumulative 
histogram  of  the  image  is  shown  in  Figure  3-4. 
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Number  of  pixels 


Figure  3-4:  Cumulative  histogram 


The  function  H(dk)  =  The  number  of  pixels  with  a  gray  level 
equal  to  or  less  than  dk.  Hence, 

k  k 

H(d,)  =  'YJh(dl)  =  'YJNl  ( 3-5) 

i= 0  i  = 0 


(3-6) 


Both  histogram  and  cumulative  histogram  are  step  functions. 

The  cumulative  histogram  H(dk)  increases  from  0  to  N,  being  the 


number  of  pixels  in  the  image,  since 


n—  1 

YjN'=N- 


i=  0 
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3.6  Invariant  Moments 


In  general,  the  moments  of  a  function  are  commonly  used  in 
probability  theory.  However,  several  desirable  properties  that  can  be  derived 
from  moments  are  also  applicable  to  image  analysis. 


Definition :  The  set  of  moments  of  a  bounded  function  f(x,y)  of 
two  variables  is  defined  by: 


oo  oo 


—oo  —oo 


xJykf(x,y )  dxdy 


(3-7) 


where,  j  and  k  take  on  all  nonnegative  integer  values. 

As  j  and  k  take  on  all  nonnegative  integer  values,  they  generate  an 
infinite  set  of  moments.  Furthermore,  this  set  is  sufficient  to  specify  the 
function  f(x,y j  completely.  In  other  words,  the  set  {Mjk}  is  unique  for  the 
function  f(x,y),  and  only  f(x,y)  has  that  particular  set  of  moments. 

The  parameter  j+k  is  called  the  order  of  the  moment.  There  is  only 
one  zero-order  moment, 


M 


00  _ 


00  00 


f  (x,  y)  dxdy 


—00  —oo 


(3-8) 
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There  are  two  first-order  moments  and  correspondingly  more 


moments  of  higher  orders. 

The  coordinates  of  the  center  of  gravity  of  an  object  are: 


x 


M 

~M 


01_ 

00 


where, 


Moo  = 


f(x,y)dxdy 


(3-9) 


(3-10) 


(3-11) 


x  / (x,  y  )  dxdy 


(3-12) 


y  f  (x,  y )  dxdy 


(3-13) 


3. 7  Spatial  Moments  of  Binary  Images  and  Level  Sets 

The  spatial  moments  of  an  object  in  an  image  are  statistical  shape 
measures  that  give  statistical  measures  related  to  an  object’s  characteristics. 


21 


The  zero-order  spatial  moment  is  computed  as  the  sum  of  the  pixel 
brightness  values  in  an  image.  In  the  case  of  binary  image,  this  is  simply  the 
number  of  pixels  in  the  object,  because  every  object  pixel  is  equal  to  1 
(white).  Therefore,  the  zero  order  spatial  moment  of  a  binary  object  is  its 
area.  For  a  gray-scaled  image,  an  object’s  zero-order  spatial  moment  is  the 
sum  of  its  pixel  brightness. 

The  first  order  spatial  moments  of  an  object  contain  two  independent 
components  x  and  y.  They  are  the  x  and  y  sums  of  the  pixels  brightness  in 
the  object,  each  multiplied  by  its  respective  x  or  y  coordinate  location  in  the 
image. 

In  the  case  of  a  binary  image,  the  first-order  x  spatial  moment  is  just 
the  sum  of  the  x  coordinates  of  the  object’s  pixels,  because  every  object 
pixel  is  equal  to  1  (white).  Likewise,  the  y  spatial  moment  is  the  sum  of  the 
y  coordinate  of  the  object’s  pixels.  For  a  gray-scaled  image,  an  object’s  first 
order  spatial  moments  are  as  defined  above.  The  first-order  spatial  moments 
of  an  object  represent  the  object’s  mass  and  how  it  is  spatially  distributed. 

The  two  most  common  image  object  measurements  that  use  spatial 
moments  are  object  area  and  center  of  mass  (a.k.a  centroid).  As  stated 
above,  an  object’s  area  is  computed  as  it’s  zero-order  spatial  moment.  An 
object’s  center  of  mass  can  be  computed  as  the  first-order  spatial  moments 

(x  and  y)  divided  by  the  zero  order  moment,  or  the  object  area. 
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There  are  two  forms  of  the  center  of  mass,  one  that  considers  pixels  to 
have  uniform  weight,  as  in  a  binary  image,  and  one  that  weights  pixels  based 
on  their  brightness  values.  The  second  form  considers  pixels  that  are  black 
to  have  a  weight  =  0,  those  that  are  white  to  have  a  weight  =  255,  and  pixels 
with  brightness  in  between  to  have  a  weight  corresponding  to  their 
respective  gray-levels. 

The  definitions  for  the  center  of  mass  measures  are  as  follows: 

Brightness- Weighted  Center  of  Mass: 

The  balance  point  (x,y)  of  the  object  where  there  is  equal  brightness 
above,  below,  left,  and  right.  If  we  think  of  the  pixels  in  an  object  as  having 
a  weighted  dependent  upon  their  brightness,  then  the  brightness  weighted 
center  of  mass  is  the  point  where  the  object  will  perfectly  balance  on  the  tip 
of  a  point,  as  shown  in  Figure  3-5. 


23 


Center  of  mass 


Figure  3-5:  Center  of  mass  of  a  gray-scale  image 


Center  of  Massx 
Center  of  Massy 


Sum  of  objects  x-pixel  coordinates  x  pixel  brightness 
Number  of  pixels  in  object 

Sum  of  objects  y-pixel  coordinates  x  pixel  brightness 
Number  of  pixels  in  object 


For  a  binary  image,  the  pixel  brightness  will  be  equal  to  1.  So,  for  a 


binary  image: 


^  Sum  of  objects  x-pixel  coordinates 

Center  of  Massx  = - - - - - 

Number  of  pixels  in  object 


Center  of  Massy 


Sum  of  objects  y-pixel  coordinates 
Number  of  pixels  in  object 


Figure  3-6  shows  the  center  of  mass  for  a  binary  object. 
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Figure  3-6:  Center  of  mass  of  a  binary  image 


For  an  NxM  gray-scaled  image,  equation  (  3-7)  can  be  changed  to 
discrete  version  as  follows: 


M  -1  N  -1 


M]k  = 


=  (x,y) 


(3-14) 


y  =0  x=0 


And  for  an  NxM  binary  image,  equation  (  3-7)  reduces  to: 


M , 


M- 1  N- 1 

IX 

y= 0  x=0 


x'ykS(f(x,y)-\) 


(3-15) 


Equation  (  3-8)  can  also  be  changed  to  the  following: 


M  -1  N- 1 


(3-16) 


y  =0  x  =0 


Likewise,  equation  (  3-9)  can  be  changed  as  follows: 
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(3-17) 


M 


10 


M- 1  N  - 1 

-II  *f  (x  >y ) 


y  =  0  x  =0 


And,  equation  (3-10)  can  be  changed  as  follows: 


M  -1  JV-1 


M01  =  SI  yf  (X,y) 


(3-18) 


v  =0  x=0 


Finally,  the  center  of  gravity  of  an  image  will  be: 


M 


M- 1  A-l 

II  xf  (x,y) 


to  _  y=o  x=o 


M 


oo 


II 

y  =0  x  =0 


(3-19) 


f  (x,y) 


y  = 


^01 


M  -1  A-l 

II  yf  (x,y) 

y  =0  x  =0 

M  -1  y-T 

II/<"’,,) 

y  =0  x  =0 


(  3-20) 


We  can  also  compute  higher-order  spatial  moments.  For  instance,  the 
second-order  moments  produce  object  orientation  information.  Spatial 
moments  of  an  order  that  is  greater  than  two  produce  abstract  information 
that  is  difficult  to  tie  specifically  to  physical  object  characteristics. 
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3.8  Motion  Analysis 


3.8.1  Image  Translation 

The  basic  model  of  disparity  between  two  images  is  translation. 
Translation  is  used  to  move  regions  of  an  image  intact  to  other  locations 
within  the  image.  Typically,  it  indicates  that  an  object  in  the  foreground  has 
moved.  If  the  translation  operations  moves  a  region  outside  the  area  defined 
by  the  original  image,  then  a  new  image  must  be  created  that  encompasses 
the  original  image  plus  the  translated  region.  Image  translation  is  defined  as 
follows: 


X 

new 

_ 

1 

o 

X 

+ 

AX 

_y  new 

1 

o 

_AT  _ 

where,  xM  yQid  are  the  pixel  coordinates  of  an  arbitrary  point  in  the  region  to 
be  translated;  and,  xnew,  ynew  are  the  coordinates  of  its  location  after  of  the 
translation  is  complete.  The  values  Ax,  and  Ay  define  the  amount  of 
translation  in  the  x  and  y  directions,  respectively.  For  each  pixel  within  a 
region  to  be  translated,  Equation  (  3-21)  is  applied  to  produce  a  new  set  of 
translated  coordinates.  In  translating  a  region,  the  original  image  is  first 
copied  to  the  output  image  and  then  the  region  to  be  translated  is  moved  to 
its  new  position  within  the  image  using  Equation  (  3-21).  If  the  pixels  within 
the  original  region  to  be  translated  are  left  unchanged,  the  translation  process 
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becomes  equivalent  to  an  image  copy.  If,  on  the  other  hand,  the  original 
region  to  be  translated  is  filled  with  a  constant  gray  level  (erased),  the 
translation  operation  becomes  equivalent  to  a  move  operation.  Figure  3-7 
shows  an  example  of  a  translation. 

0  - x - ►  N-1  0  - x - ►  N-1 

0 


► 

▼ 

M-1 

Figure  3-7:  Translation  example 

Translation  by  integer  pixel  values  is  straight  forward.  However, 
translation  by  subpixels  must  be  realized  using  bilinear  interpolation. 


M-1 


28 


3.8.2  Image  Rotation 


Rotation  is  one  of  the  fundamental  models  of  linear  spatial 
transformations  between  two  images.  It  is  characterized  by  two  parameters: 
center  of  rotation,  and  the  rotation  angle. 

Consider  a  counter-clockwise  rotation  of  the  camera.  The  net  effect  is 
a  clockwise  rotation  of  all  pixels  to  a  new  location. 


X 

new 

— 

_y  new 

cos# 

sin# 


-sin# 

X  old 

cos# 

y  old  _ 

where,  6  is  the  angle  of  rotation. 
Further  analysis  will  indicate  that: 


y» 


COS# 

-sin# 

Xold 

sin# 

cos# 

_y  oM  _ 

( 3-22) 


( 3-23) 


where  the  quantity  x  indicates  an  average  value. 
Then, 


x  —x 

new  new 

cos# 

-sin# 

1 - 

l*° 

i 

i _ 

_  y new  y new  _ 

sin# 

cos# 

- 1 

1 

o 

PS 

It  is  often  convenient  and  more  desirable  to  analyze  and  characterize  the 
motion  of  individual  objects  in  the  scene,  including  their  observed 
rotation(s).  The  expression  (3-24)  above  facilitates  such  a  mechanism. 


29 


From  (3-22)  and  (3-2)  we  conclude  that: 


X new 

cos  6 

-sin# 

Xold  Xold 

+ 

_  y new  _ 

sin# 

cos# 

y0id  ~  y  oM  _ 

Figure  3-8 


(3-25) 


Xold 


(b) 


(a)  <->  (b) 

Camera  rotation  centered  at  (0,0) 


(a)  (c) 

Object  "A"  has  rotated  and  then  translated  to  A', 


Example 


Composite  motion  comprised  of  both  geometrical  translation  and 
rotation  of  a  region  within  an  image  about  its  geometrical  center  Mx>  My,  is 
expressed  as: 


X 

new 

cos  # 

-sin# 

Xo,d 

X' 

u 

y new  _ 

sin# 

cos# 

1 

_y old  _ 

X. 

old  J 

M 

(3-26) 


The  geometrical  center  (centroid)  of  the  region  as  we  have  seen  before 
is  given  by: 

1  N 

Mx=^YuXi  (3-27) 

and 

1  N 

(3-28) 

where,  x,  and  y,  are  the  coordinates  for  each  pixel  in  the  region  to  be 
translated  and  the  parameter  N  is  defined  as  the  number  of  pixels  within  the 
region  being  translated. 

Equation  (  3-26)  can  also  be  used  to  rotate  an  entire  image  about  the 
particular  point  x0,  y0  by  setting  Mx=  xa,  My=  yQ.  Once  the  rotation  is 
completed,  the  image  is  then  translated  back  to  its  original  position  xf),  yQ. 
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3.8.3  Image  Scaling 


Another  common  type  of  geometrical  operation  is  that  of  scaling. 
Scaling  provides  a  means  of  reducing  or  enlarging  the  size  of  an  image. 
Desired  regions  within  an  image  can  magnified  to  spatially  enlarge  features 
that  would  otherwise  be  difficult  to  observe.  Geometrical  image  scaling  is 
defined  mathematically  in  equation  (  3-29) 

xnew~\  r-,  °ir*oW 

=  A  (3-29) 

y  U  cr  y 

_y  new  J  |_  y  J  \_y  old 

To  scale  a  total  image,  xokl,  y0id  are  defined  over  the  coordinates  of  the 
entire  image,  and  for  region  scaling  xM  yold  are  defined  by  the  pixels  within 
the  region  to  be  scaled.  For  ox  and  ay  >  1,  the  output  image  will  be  an 
enlarged  version  of  the  input  image,  while  for  ox  and  oy<  1  the  scaled  output 
image  is  a  reduced  version  of  the  input  image.  For  either  ox  or  ay  negative, 
the  image  is  rotated  about  the  axis  of  the  negative  scaling  parameter.  For 
example  if  ox=-3,  oy=l ,  the  image  is  increased  by  three  and  is  flipped  about 
the  x  axis.  Geometric  scaling  in  particular  requires  the  use  of  interpolation 
prior  to  scaling  an  image.  Interpolation  will  be  discussed  later  in  this 
chapter. 
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3.8.4  Image  Skewing 


The  next  basic  model  of  shape  change  or  disparity  is  skewing 
(deformation)  or  shear  change.  Figure  3-9  shows  an  image  of  a  rectangle  that 
has  been  skewed  to  the  right  in  the  x  direction  by  an  angle  of  a.  Figure  3-10 
shows  the  same  image  skewed  to  the  lower  direction  of  the  y-axis  by  an 
angle  of  a. 

The  skewing  geometrical  transformation  is  defined  by 


X 

new 

— 

_y  new 

cos  a 
sin  a 


sin  a 
cos  a 


x 


old 


y 


old 


where,  a  is  the  deformation  angle. 


( 3-30) 


0  - x - ►  N-1 


0 


N-1 


Figure  3-9:  Image  deformation  to  the  right  in  the  x-axis  direction 
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N-1 


N-1 


Figure  3-10:  Image  deformation  to  the  lower  direction  of  y-axis 

Suppose  an  ellipse  shaped  disc  were  to  be  rotated  by  an  axis  parallel 
to  its  surface,  whose  orientation  is  not  parallel  to  the  major  or  minor  axes, 
the  resulting  new  contour  will  exhibit  a  shape  conducive  to  be  analyzed  by 
this  model. 
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3.9  Image  Interpolation 


A  large  number  of  geometric  transformations,  such  as  translation, 
rotation,  and  shearing  will  map  pixels  to  a  new  position  that  is  no  longer  an 
integer  and  so  not  on  the  original  sampling  grid.  Figure  3-11  illustrates  that  a 
rotation  of  the  image  requires  the  evaluation  of  intensity  at  points  that  were 
not  on  the  original  grid. 


Figure  3-11:  Illustration  that  a  rotation  of  the  image  requires  interpolation 


Interpolation  is  a  process  of  generating  a  value  of  a  pixel  based  on  its 
neighbors.  Neighboring  pixels  contribute  a  certain  weight  to  the  value  of  the 
pixel  being  interpolated.  This  weight  is  often  inversely  proportional  to  the 
distance  at  which  the  neighbor  is  located. 
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There  are  several  different  types  of  interpolation  methods.  Nearest 
neighbor  interpolation  is  the  simplest  method  and  basically  makes  the  pixels 
bigger.  The  value  of  a  pixel  in  the  new  image  is  the  value  of  the  nearest  pixel 
of  the  original  image.  The  other  interpolation  methods  also  include  bilinear 
interpolation  and  bicubic  interpolation.  The  interpolation  method  that  is  used 
in  our  DIS  is  bilinear  interpolation.  Bilinear  interpolation  determines  the 
value  of  a  new  pixel  based  on  a  weighted  average  of  the  4  pixels  in  the 
nearest  2x2  neighborhood  of  the  pixel  in  the  original  image.  Figure  3-12 
shows  four  neighboring  pixels  surrounding  the  pixel  (x,y)  to  be  interpolated. 


Figure  3-12:  Bilinear  Interpolation 
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In  Figure  3-12,  we  assumed  u  and  v  are  the  integer  parts  of  x  and  y, 


respectively,  bilinear  interpolation  is  defined  by 

f(x,y)  =  WuJ(u,v)  +  Wu+lJ(u  +  l,v)  +  WU'V+lf(u,v  + 1)  +  Wu+hv+J(u  +  l,v  + 1) 

(3-31) 

where, 

Wu,r  =(«  +1“*)(V  +1-JV) 

Wu+Uv  =(x  -u)(y  +  \-y) 

Wuv+l=(u+l-x)(y-v) 

K+1,r+1=(X  ~u)(y  ~v) 

The  bilinear  interpolation  has  an  anti-aliasing  effect  and  therefore 
produces  relatively  smooth  edges. 
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IV.  Methodology 


A  General  method  for  DIS  includes  two  modules:  motion  estimation 


module  and  motion  compensation  module.  The  motion  estimation  module 
calculates  global  motion  vector  of  input  frame  relative  to  reference  frame. 
Then,  the  motion  compensation  module  processes  input  frame  according  to 
motion  vector  and  stabilizes  observed  images.  Figure  4-1  shows  a  block 


diagram  of  such  a  system. 


Figure  4-1:  DIS  Model 

With  the  advantage  of  low  energy  consumption,  light  weight  and 
compact  size,  DIS  technique  offers  excellent  performance  in  the  case  of  low 
frequency  and  small  amplitude  system  vibrations. 


4.1  Motion  Estimation  Module 

The  DIS  proposed  in  this  thesis  is  based  on  the  following  assumptions 
that:  each  frame  in  the  given  image  sequence  is  distinct,  and  the  image 
instability  is  the  result  of  translation,  rotation,  skewing  and  scaling  between 
frames. 
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Through  analyzing  image  frames,  the  motion  vectors  (including 
amounts  of  translation,  rotation  and  scaling),  which  are  the  basis  of 
compensation  processing,  can  be  calculated.  Motion  estimation  between 
frames  is  usually  based  on  a  rigid  motion  model  as  follows: 


X 
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_ 

0  " 
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cos  # 

-sin# 

X  old 
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_y  new  _ 
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cos# 
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_AT_ 

(4-1) 

The  above  given  model  is  explained  in  the  following  text.  In  the 
formula,  xnew,  xokl  are  horizontal  coordinates  of  corresponding  pixels  in  input 
frame  and  reference  frame;  ynew,  y0id  are  vertical  coordinates  of 
corresponding  pixels  in  input  frame  and  reference  frame;  Ax,  Ay  are 
translation  amounts  between  two  frames;  0  and  a  are  the  rotation  and 
deformation  angles  between  two  frames  respectively.  The  two  factors  ax,  oy 
are  the  scaling  factors. 

Equation  (4-1)  can  be  rewritten  as  follows: 


^  new 

=  A 

X  old 

+ 

AT 

_y  new  _ 

_y  ou  _ 

_AT  _ 

where,  A  is  a  sequence  of  rotation,  scaling  and  angular  deformation. 
And  it  can  be  decomposed  in  the  form: 
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Matrix  A  is  a  4x4  matrix.  So,  it  is  in  the  form: 


A  = 
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Hence, 


an 
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cos# 

-sin# 

_a2\ 

a22_ 

0 
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sin  a 

cos  a 

sin# 

cos# 
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cos(a  -  #)  sin(a  -  0) 
sin(«  +  0)  cos(«  +  0) 


gx  cos(a  -  0)  <jx  sin(a  -  0) 
Gy  sin(a  +  0)  ay  cos(or  +  9) 


By  solving  Equation  ( 4-3), 


(4-3) 


/  2,2 

<7*  —  -\la  n+a  12 

(4-4) 

/  2  2 
~  21 22 

(4-5) 

(a  -  9)  =  atan2(a12  ,au) 

(4-6) 

{a  +  0)  =  atan2(a21  ,a22 ) 

(4-7) 

40 


To  find  the  values  of  a  and  6,  we  need  to  solve  Equation  (  4-6)  and 
Equation  ( 4-7)  simultaneously, 

2  a  =  atan2(«12  ,au)  +  atan2(a21 ,  a22 ) 

_  atan2(a12,au)  +  atan2(a21,a22) 

a-  2  (4'8) 

By  substituting  the  value  of  a  into  Equation  (  4-7), 

atan2(a12  ,an)~  atan2(a2 1 ,  a22 ) 

a  = - 2 -  (4'9) 

Now,  we  have  six  variables  to  estimate,  these  values  are  show  in  Table  4-1. 


Table  4-1:  Motion  vectors  variables 


Motion  Vectors 

Description 

Ox 

The  scaling  factor  in  x  axis 

Oy 

The  scaling  factor  in  y  axis 

e 

Rotation  angle 

a 

Deformation  angle 

Ax 

Translation  in  x  axis 

Ay 

Translation  in  y  axis 

Our  aim  is  to  estimate  the  elements  of  A  and  the  translation  vector 
(Ax, Ay)  from  two  given  images.  Since  ax,  ay,  a,  and  6  are  functions  of  the 
elements  of  A,  then  it  is  sufficient  to  find  the  value  of  A  to  get  the  values  of 
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ax,  Gy,  a,  and  6.  Because  we  have  six  unknowns,  then  we  need  six  equations 
to  be  solved  simultaneously. 


Assume  x_ n  is  a  feature  point  in  an  image  at  time=/  where  n  is  the 
image  number.  And  assume  x_ '  is  the  same  feature  point  in  the  same  image 

at  time=t+7  where  n  is  the  image  number.  We  have  agreed  before  in 
Equation  (3-30)  that: 

*!,  =^Xn+C  (4-10) 


where, 


xll= 

,  — n  = 

X„ 

and  C  ~ 

AX 

LynJ 

_yn_ 

_Ay_ 

In  order  to  estimate  the  value  of  A  and  C,  we  need  6  images;  that  is 
three  images  at  time=t  and  the  same  3  images  but  at  time=/+7.  This  can  be 
written  mathematically  as  follows 


x[  -Ax_ j  +C 


—  Ax_  2  +C 


X3  -  Ax_s  +  C 


These  equations  can  also  be  expanded  to  the  following  equations: 

x[=anx  j  +  auy  {  +ax 
y[=a2lxl+a22yl+Ay 
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X  2  —  jX  2  +  CL^2 y  2  "I" AX 
y  2  =«21X2+a22^2+A3; 

X  3  =  anX  3  +  ai2J  3  +AX 
T3  =«21^3+«22J;3+A^ 

These  equations  can  be  solved  simultaneously  to  find  the  values  of  an,  a12, 
a 21 ,  a22,  Ax  and  Ay.  The  above  computation  can  be  expressed  in  the  form  of 
matrix  algebra  as  follows: 
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y'3  _ 

x3  y3  1 

_*y_ 

Which  is  of  the  form: 

p  =  Pa 

The  above  equation  reveals  several  important  facts.  First,  a  minimum  of 
three  points  must  be  known  in  each  image.  Second,  these  points  should  not 
be  collinear.  If  they  are  collinear,  then  P  can  not  be  inverted.  When  more 
than  three  points  are  known  in  the  images,  then  standard  pseudo  inverse 
computation  computes  an  optimal  estimate  of  a  such  that: 

a  =  (P‘Py1Ptp 
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4.1.1  Image  Segmentation 


Segmentation  is  the  process  of  partitioning  an  image  into  regions  or 
subimages.  The  region  or  the  subimage  here  is  defined  as  a  group  of  pixels 
with  similar  properties.  These  properties  include  same  graylevel  or  textures 
...  etc.  We  will  use  graylevel  as  the  property  to  distinguish  between  the 
subimages.  The  simplest  representation  of  a  segment  is  a  binary  valued 
image,  where  each  pixel  is  assigned  a  value  ‘  1  ’  if  it  is  in  the  region,  and  a  ‘O’ 
otherwise. 

Segmented  images  must  satisfy  the  following  two  properties: 

1 .  Distinctness  : 

No  pixel  is  shared  by  two  regions.  That  is 

Rt  =0  for  i,j  =  1,  ... ,  k; 

i  ^  j 

where,  R  is  a  subimage  and  k  is  the  maximum  number  of  subimages 
intended  to  create. 

2.  Completeness : 

All  pixels  in  the  image  must  be  assigned  to  one  of  the  k  regions. 
That  is 

=/ 

where,  I  is  the  original  image  intended  to  be  segmented. 


44 


The  first  property  states  that  regions  are  disjoint  sets  and  the  second 
property  states  that  the  entire  image  I  must  be  covered  by  the  regions  R,  , 
i=l,  2,  k. 

One  of  the  simplest  methods  to  segment  an  image  is  to  apply 
thresholding.  Thresholding  is  a  method  for  image  segmentation.  The 
cumulative  histogram  of  the  image  is  used  to  determine  the  proper  value  of 
the  threshold.  The  general  equation  to  create  some  binary  images  from  a 
gray-scaled  image  can  be  written  as  follows: 


1  f(x,y)<kn 
0  f(x,y)>kn 


(4-12) 


where,  Bn  is  a  binary  image,  k„  is  the  threshold  value  used  in  the 
segmentation  to  create  this  binary  image  and  f(x,y)  is  a  gray-scaled  image. 
One  could  iteratively  try  to  determine  the  best  threshold  kn  by  a  systematic 
trial  and  error  process.  Also,  well  established  decision  techniques  can  be 
applied  to  estimate  an  optimal  threshold  kn,  when  the  parametric  model  of 
the  underlying  distribution  (histogram)  is  known. 

In  our  work,  we  have  chosen  a  level-sets  based  approach  to  selecting  up  to 
six  thresholds  to  divide  image  into  six  binary  images.  This  approach  is 
explained  in  the  following  text. 

In  order  to  determine  the  proper  value  of  kn  ,  the  image  will  be  divided  into 

regions  according  to  the  gray  levels.  The  cumulative  histogram  is  a  useful 
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tool  to  determine  the  threshold  values  needed  in  the  segmentation.  To  find 
the  best  threshold  values,  the  y-axis  of  the  cumulative  histogram  which 
represents  the  number  of  pixels  should  be  divided  into  the  same  number  of 
subimages  needed  to  create.  In  our  DIS  algorithm,  the  number  of  binary 
subimages  is  six.  So,  the  y-axis  should  be  divided  into  six  portions. 

Assume  we  have  a  cumulative  histogram  of  an  image  plotted  in 
Figure  4-2. 


Number  of  pixels 


Figure  4-2:  Cumulative  Histogtam  of  an  image 


46 


In  this  example,  six  binary  images  can  be  created  by  using  a  modified 


version  of  Equation  (4-12)  as  follows: 


1  kn_x  <f(x,y)<kn 
0  kn<f(x,y)<kn_l 


The  threshold  values  (Ay,  k2,  k3,  Ay,  k5,  and  Ay)  can  be  calculated  as  follows: 


kn=H~\Dn) 


where, 


D 


n 


n  xN 
m 


where,  n  is  the  threshold  number  and  it  can  take  values  from  1  to  the  number 
of  binary  subimages,  at  least  six  in  this  context.  N  is  the  total  number  of 
pixels  in  the  image.  And  m  is  the  desired  number  of  binary  subimages. 

The  segmentation  process  discussed  earlier  will  result  in  6  binary 
subimages.  All  these  subimages  will  be  used  to  estimate  the  value  of  motion 
vectors  ( ax ,  ay,  a,  6,  Ax,  and  Ay).  Among  the  six  subimages,  each  time  we 
will  use  3  subimages  to  estimate  the  motion  vectors.  So,  the  resultant 
number  of  motion  vectors  will  be: 


6! 

(6  —  3)  !x  3 ! 


=  20 


Hence,  we  will  have  20  motion  vectors.  A  question  arises  here,  which  one  of 
these  motion  vectors  should  be  used? 
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To  determine  the  best  motion  vector,  we  are  using  a  statistical  tool 
called  “clustering”.  To  prepare  our  data  to  be  clustered  and  then  ready  for 
analysis,  the  6  and  a  values  should  be  plotted  in  x-axis  and  y-axis 
respectively.  Then,  a  clustering  technique  will  be  used  to  analyze  these  data. 
The  following  section  explains  in  detail  the  idea  of  “clustering”  and  why  we 
need  it  here. 

4.1.2  Data  Clustering 

Clustering  is  a  classical  topic  in  statistical  data  analysis  and  machine 
learning.  There  is  much  research  work  discussing  clustering  methods  [5].  It 
is  defined  as  the  process  of  grouping  a  set  of  objects  into  classes  of  similar 
objects.  We  can  show  this  with  a  simple  graphical  example  as  in  Figure  4-3. 
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The  most  well  known  similarity  measures  are  based  on  distances,  such 
as  Euclidean  distance  and  Manhattan  distance.  There  are  many  algorithms 
can  be  used  to  implement  data  clustering.  In  this  thesis,  a  graph  theoretic 
algorithm  will  be  used  to  do  the  job.  Graph  theoretic  algorithm  for  clustering 
is  a  technique  based  on  modified  KruskaTs  algorithm.  The  purpose  is  to  take 
the  advantage  of  the  simplicity  of  tree  structure,  which  can  facilitate  efficient 
implementations  of  much  more  sophisticated  clustering  algorithms.  There 
are  many  variations  in  the  family  of  graph  theoretic  algorithms,  such  as 
Minimal  Spanning  Tree  (MST)  based  method,  Cut  algorithm,  and 
Normalized  Cut/Spectral  methods.  In  general,  the  idea  of  graph  theoretic 
algorithms  is  the  following:  firstly,  it  constructs  a  weighted  graph  upon  the 
points  in  the  high-dimensional  space,  with  each  point  being  a  node,  we  will 
use  {6,  a)  as  nodes  and  the  distance  value  between  any  two  nodes  being  the 
weight  of  the  edge  connecting  the  two  nodes.  Then,  it  decomposes  the  graph 
into  connected  components  in  some  way,  and  calls  those  components  as 
clusters  or  forests.  We  mainly  focus  on  an  MST-based  clustering  algorithm 
using  KruskaTs  algorithm.  KruskaTs  algorithm  is  used  to  create  minimum 
spanning  tree  and  it  works  as  follows: 

1 .  Consider  the  edges  from  shortest  to  longest. 

2.  Take  the  first  (smallest)  edge  and  then  consider  the  next  edge. 

3.  Take  an  edge  if  it  does  not  make  a  cycle. 
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4.  If  you  still  have  edges  then  go  to  step  1. 

In  our  case  here,  we  will  not  continue  to  create  a  full  minimum 
spanning  tree  because  this  is  not  our  goal.  Instead,  we  will  stop  whenever  we 
have  an  optimum  cluster  which  satisfies  our  conditions.  The  conditions  we 
set  to  be  satisfied  are  two: 

a.  The  cluster  (forest)  should  contain  all  the  subimages. 

b.  The  distance  between  any  two  nodes  in  the  cluster  should  not  exceed 
10%  of  the  largest  distance  between  the  nodes. 

Eventually,  we  will  get  a  cluster  with  some  nodes.  Assume  F  is  the 
chosen  cluster,  then  ¥={nh  n2,  ...,  n„},  where,  nt  is  a  node  in  {6, a)  graph. 
We  know  that  every  node  here  is  calculated  using  3  subimages  as  we 
have  seen  earlier. 

Because  every  node  is  constructed  by  three  subimages,  then  assume: 

ni  =  (pi,  qi,  ri) 
n2  =  (P2,  q2,  r2) 

n„  =  (Pn,  qn,  r„) 

where,  p„  qt  and  rh  are  subimages. 

For  every  node  nh  the  size  of  the  subimage  with  the  minimum  size 
(w)  will  be  used  to  calculate  the  final  motion  vectors  as  follows: 
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a  - 


(4-13) 


yv 


y  (w,-4) 

*  =  ^ -  (4-14) 


i=i 


Af  = 


(4-15) 


(4-16) 


(4-17) 


Av  = 


(4-18) 


where, 


w  f  =  min  (s/ze  (/?,.  ),size(qi  ),size(ri )) 

i  =\...n 
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At  the  end  of  this  phase  of  the  DIS  algorithm,  we  have  the 
estimated  motion  vectors  (ax,  oy,  a,  6,  Ax,  and  Ay)  which  are  necessary  in 
the  next  phase  which  is  “Motion  Compensation”. 


52 


4.2  Motion  Compensation  Module 


The  result  of  the  motion  estimation  process  described  in  the  last 
section  is  capable  of  computing  the  motion  vectors  between  two  frames.  The 
objective  of  motion  compensation  is  to  keep  some  kind  of  history  of  the 
motion  estimates  in  order  to  create  a  stabilized  sequence.  We  have  seen  that 
the  DIS  proposed  is  based  on  a  hypothesis  that  the  image  instability  in  image 
sequence  is  the  result  of  translation,  rotation,  skewing  and  scaling  between 
frames.  So,  by  knowing  these  motion  vectors  which  are  estimated  in  the  last 
section,  an  image  can  be  constructed. 

An  image  can  be  constructed  using  the  hypothesis  in  Equation  (4-10): 

X, 'n  =AXn+C 


where,  A  as  calculated  in  the  last  section: 


gx  cos  (a  -  0)  <jx  sin(a  -  0) 
<jy  sin(a  +  0 )  ay  cos(a  +  0 ) 


and 


C  = 


AX 

Ay 


As  we  know  already,  pixels  of  an  image  occupy  integer  coordinates. 
We  can  note  from  Equation  (4-10)  that  the  destination  pixels  may  lie 


53 


between  the  integer  coordinates.  So,  in  order  to  create  an  image  from  these 
pixels,  destination  pixels  are  interpolated  at  the  integer  coordinates. 

4.3  Frequency  Domain  Approach  To  Estimate  Image  Translation 

This  section  introduced  the  Fourier  transform  based  approach  to 
estimate  image  translation  between  two  images. 

4.3.1  Introduction 

So  far  in  this  thesis,  we  have  considered  only  one  kind  of  image 
representation.  For  the  most  part,  the  images  have  recorded  brightness  as  a 
function  of  position.  In  practice,  there  are  many  different  ways  in  which 
image  data  can  be  presented.  These  changes  of  representation  are  useful  to 
analyze  certain  characteristics  of  the  images  more  efficiently.  Some  kinds  of 
transformation  produce  representations  which,  although  different  from  the 
original  data,  are  completely  equivalent  to  it  in  terms  of  the  information 
contained.  These  so-called  domain  transforms,  of  which  the  Fourier 
transform  is  by  far  the  most  important,  allow  images  to  be  treated  in  ways 
entirely  different  from  those  used  on  the  original  data. 
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4. 3. 2  Transforming  Domain 


Domain  transforms  provide  alternative  ways  of  describing  an  image. 
Instead  of  recording  brightness  as  a  function  of  position,  we  can  choose  a 
completely  different  presentation  of  the  image. 

In  Fourier  transform,  the  image  is  stored  as  a  set  of  spatial  frequency 
values  together  with  their  associated  amplitudes  and  phase.  The  point  is  that 
instead  of  brightness  as  a  function  of  position,  the  Fourier  representation  is 
a  complex  valued  function  of  spatial  frequency.  In  the  frequency  domain, 
changes  in  image  position  produce  a  noticeable  changes  in  the  phase  of  each 
spectral  component.  The  phase  change  can  be  quantitively  measured,  and 
used  to  characterize  the  motion. 


4.3.3  Fourier  Transform  of  an  Image 

As  we  are  only  concerned  with  digital  images,  we  will  use  the 
Discrete  Fourier  Transform  (DFT).  The  DFT  is  the  Fourier  Transform 
variation  used  in  digital  processing. 

The  Fourier  transform  of  a  MxN  image  is  shown  mathematically  as: 

H(u,v)  =  YJYJh(u,v)e  ^ M  N\  -71<0<71  (4-19) 

x=0  y=  0 


where,  h(x,y)  is  the  image  to  be  transformed  and  H(x,y)  is  the  transformed 
one. 
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It  is  also  possible  to  transform  image  from  the  frequency  domain  back 
to  the  spatial  domain.  This  is  done  with  an  inverse  Fourier  transform  as 
follows: 


h(x,y)  =  YJ<£H(u,v)e 


x=0  y= 0 


(4-20) 


In  the  frequency  domain,  u  represents  the  spatial  frequency  along  the 
original  images  axis  and  v  represents  the  spatial  frequency  along  the  y  axis. 
In  the  center  of  the  image  u  and  v  have  their  origin. 

The  Fourier  transforms  deals  with  complex  numbers.  The  magnitude 
is  expressed  as: 


I  H(u,v  )|  =  ^R2(u,v)  +  l2(u,v) 


(4-21) 


and  phase  as: 


0(u,v)  =  tan  1 


~  I  (u,v)~ 

R(u,v ) 


(4-22) 


where,  R(u,v)  is  the  real  part  and  I(u,v)  is  the  imaginary. 

The  frequency  is  dependent  on  the  pixel  location  in  the  transform.  The 
further  from  the  origin  it  is,  the  higher  the  spatial  frequency  it  represents. 

The  computation  of  the  Fourier  transform  stores  the  real  and 
imaginary  spectral  components  in  arrays,  starting  with  positive  frequency 
values  followed  by  negative  frequency  values.  Figure  4-4  shows  an  example 
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of  the  magnitude  spectrum  from  a  one-dimensional  DFT,  showing  that  the 
negative  frequency  components  follow  the  positive  frequency  components. 


Figure  4-4:  Uncentered  magnitude  spectrum 

Normally  when  plotting  spectral  components  using  the  Cartesian 
coordinate  system,  negative  frequency  components  are  plotted  first  followed 
by  the  positive  frequency  components.  The  first  half  and  last  half  of  the 
array  of  Fourier  components  must  be  swapped.  This  can  be  done  by 
multiplying  every  pixels  in  the  image  by  (~l)x+y ,  that  is: 

f(x,y)=>f(x,y)(-iry  (4-23) 

Equation  (  4-23)  is  the  centering  property  of  the  DFT. 

Figure  4-5  shows  the  output  from  the  DFT  after  application  of  the  centering 
property. 


57 


Figure  4-5:  The  Centered  magnitude  spectrum 


Figure  4-6  shows  the  uncentered  magnitude  spectrum  of  an  image 
containing  a  white  object.  Figure  4-7  shows  the  DFT  spectrum  of  the  same 
image  after  application  of  the  centering  property. 


Figure  4-6:  Uncentered  spectrum  of  an  image 
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Figure  4-7:  The  centered  spectrum  after  using  centring  property  of  DFT. 

4.3.4  Translation  Estimation 

In  this  chapter,  a  frequency  domain  method  is  investigated  for 
estimating  the  translation  between  two  images. 

The  motion  between  a  reference  image  and  the  second  is  assumed  to 
be  a  pure  translation.  Considering  such  kind  of  displacement  between  two 
images  the  motion  may  be  simply  described  by  two  parameters:  horizontal 
and  vertical  shifts. 
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In  the  Fourier  transform  domain  relation  between  two  mutually 


shifted  images  can  be  expressed  as  follows: 


/ - (au+bv ) 

f2(x,y)  =  f^(x -a,y  -b)<y>  F^(u,v)e  N  (4-24) 

where,  fi(x,y)  is  the  reference  image  and  f2(x,y)  is  the  shifted  one.  Fj(u,v)  is 
the  Fourier  transform  of  fi(x,y).  It  is  known  from  Fourier  transform 
properties  that  a  spatial  shift  results  in  multiplication. 

What  we  want  here  is  to  find  out  the  values  of  a  and  b  in  equation 
(4-24)  because  they  represent  the  vertical  and  horizontal  shifts. 

We  know  from  equation  (4-24)  that: 


F2(u,v  )  =  F,(uy  )e 


ijj-(su+bv) 


(4-25) 


Dividing  Equation  (4-25)  by  F;(u,v)  : 

F2(U,V )  J^au+bv) 

IT* - 7  =  e  (4-26) 

F,(u,v) 

The  right  hand  side  term  on  this  equation  is  a  complex  number  and  it 
can  be  split  into  two  parts:  the  real  part  R(u, v)  and  the  imaginary  part  S(u,v). 
The  phase  of  this  complex  number  can  be  calculated  as  follows: 

0(u,v)  =  atan2  (S(u,v),R(u,v))  (4-27) 

The  phase  also  can  be  found  as  follows: 


0(u,v)  =  —(au  +bv) 


(4-28) 
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So,  for  every  point  in  the  frequency  domain,  there  is  a  phase  as  in 


Table  4-2. 


Table  4-2:  Phase  of  every  point  in  u-v  plane 


u-v  plane  points 

Phase 

(Ul,Vj) 

0(ui,vi) 

(U2,V2) 

0(u2,v2) 

... 

... 

(uN,VN) 

6( Um,  vn) 

By  finding  the  phase  of  every  point  using  Equation  (4-28)  and  finding 
the  square  mean  error  of  these  phases,  we  get: 


1=1 


9  JT 

— (au;.  +bvi)-0(ul,vi) 


(4-29) 


The  error  should  be  kept  minimal.  So, 


/=  1  L 


2  71 

A T 


-l2 


{aUj  +bvl)-0{ui,vi) 


^minimum 


(4-30) 


8E_ 

da 


=  0, 


8E 

db 


=  0 


(4-31) 


dF  N 

-  =  V2 
da  ^ 


?7T 

— (aui+bvi)-0(ui,vi) 


2  n  n 

.  — Uj  =  0 
N  1 


(4-32) 


dF  N 

—  =  Y2 

db  U 


~N 


(aui  +bvi)-0{ui,vi) 


2  n 

.  — v,  =  0 
N  1 


(4-33) 
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(4-34) 


N 

I 

/  =  1 


(2/r) 

2 

N  ( 

2 

2tc ^ 

In  J 

aU, 

/=  1  v 

,N) 

y^r  ^ 
A/  '  ' 


if— 

MW; 


i7(v,a+2 


/=1 


^2;r^ 

vaT, 


.2  v>  2;r 
bv  i  =  >  — v;# 

h  n  1 1 


(4-35) 


2 

—  /  ui 

N  U  ' 

2  n  ^ 

-  >  U:V: 

N  U  ‘  ' 

a 

N 

2>,-  °i 
/=i 

In  ^ 

—  /  u,v. 

In  u 

2n  4^  2 

—  /  vi 

n  tr  \ 

b  ~ 

N 

_/=  1 

(4-36) 


The  last  equation  gives  the  values  of  a  and  b  which  we  are  looking  for. 


4.3.5  Rotation  and  Scaling  Estimation 

The  underlying  computations  to  estimate  rotation  and  scaling  in 
Fourier  domain  are  not  as  simple  as  in  translation  estimation.  The 
complexities  of  the  model  poses  a  serious  question  about  the  efficiency  of 
finding  rotation  and  scale  change  in  frequency  domain  vs.  the  same  in  spatial 
domain.  Such  challenges  stem  from  the  fact  that: 

Given,  / (x,  y)  < — >  F(u,  v) ,  then: 

/(xcos<9  +  ysin  0,  -xsir\6  +  y  cos  0)< - >  F(u  cos  0-  vsin  0,  u  sin  Ov  cos  0) 

So,  the  rotation  in  the  spatial  domain  is  also  rotation  in  the  frequency 
domain  but  in  the  opposite  direction  as  shown  in  Figure  4-8. 
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Figure  4-8:  Rotation  estimation,  (a)  Frequency  values  of  a  reference  image,  (b) 

Frequency  values  of  the  rotated  image  (0=25  degrees)  [37] 


So,  using  this  method  will  not  simplify  the  rotation  estimation.  Same  thing  is 
applied  in  the  case  of  scaling  based  on  the  fact  that  scale  change  in  the 
spatial  domain  is  also  scale  change  in  the  frequency  domain.  This  can  be 


expressed  as  follows:  Given  f(x)< — >F(w) ,  then  f(ax)<r 


1  „l 

f 

nF\ 

- 

rl 

K<*J 

4.4  Affine-Motion  Inversion  Scheme  for  Jitter  Detection 

In  this  section,  a  new  approach  to  detect  jitters  in  a  sequence  of  video 
frames  is  introduced.  The  approach  seeks  to  model  the  underlying  changes  a 
series  of  2D  affine  transforms  between  consecutive  video  frames,  without 
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resorting  to  a  three  dimensional  interpretation  of  the  physical  factors  that 
give  a  rise  to  the  changes. 

The  affine  transformation  is  represented  by  a  set  of  invariants  to  be 
estimated  by  weighted  geometric  moments  of  each  observed  image.  In 
particular,  the  image-plane  will  be  viewed  as  a  collection  of  non-overlapping 
concentric  regions  of  varying  weights  of  interest.  Thus,  the  moments  will  be 
calculated  using  a  geometric-location  weighted  and  intensity  weighted 
computations. 

The  approach  proposed  is  a  simplified  strategy  to  decide  if  the 
disparity  between  a  video  frame  and  its  predecessor  is  due  to  a  smooth 
motion  or  an  erratic  jitter.  Six  moments-based  descriptors  and  the  gray  level 
histogram  are  used  to  arrive  at  that  decision.  Individual  parameters  used  for 
such  decision  include:  the  change  in  direction  in  the  apparent  motion  of  the 
weighted  center  of  gravity,  the  discontinuities  in  the  angular  velocity  of  the 
eigen-vectors  of  the  scatter  matrix  (second  order  moments),  and  the 
dynamics  of  the  focus-of-expansion  of  the  observed  ego-centric  optical  flow 
field.  These  computations  are  progressively  complex.  They  will  be 
implemented  at  different  temporal  sampling  rate,  i.e.,  the  simplest  method 
will  be  applied  to  every  frame  while  the  advanced  method  will  be  applied  to 
every  other  frame,  etc. 
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The  basic  method  computes  six  numbers:  M()(h  Mou  M10,  Mlh  M2q,  and 


M()2.  Which  are  defined  as: 


Mu  ='ZxiyJw(x’y)f(x,y) 

x,y 


(4-37) 


The  image  f(x,y)  and  the  weights  w(x,y)  are  expressed  on  a  640x480 
grid  to  be  indexed  as  -320  <  x  <320,  and  -240  <y  <240,  reflecting  a  zero- 
centered  image  plane.  A  standard  Gaussian  function  is  used  for  w(x,y)  with 
an  arbitrary  chosen  half-power  radius  of  128.  Then,  w  is  computed  as: 


w(x,y)  = 


2  2 
X  +y 


2az 


a  =  128 


(4-38) 


The  centroid  of  an  image  is  generally  the  simplest  descriptor  of  the 
image,  which  is  depicted  by: 

(Jux,Juy)  =  (Mw,M01),  (4-39) 

Or 

h’^)  =  (a/a  +  K  >  arctan2(//y ,  ) ) ;  ■ -n  <  0M  <  n.  ( 4.40) 

The  gravity  adjusted  second  order  moments,  M20,  M02  and  M u  are 
defined  as: 


2  _  v* )'  (y  -  Vy  y  w(x’  y)f(x’  y) 

M,y=— - 

^w(x,y)f(x,y) 

x,y 


(4-41) 


It  is  useful  to  represent  them  in  the  form: 
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\-n  <  0V02  <  tc\  and,  A1>A2. 


'm,„ 

^cos#, 

cos 

0^ 

^cos(?j  sin#,^ 

KMn  M22  j 

v  sin  6? 

sin  62 

1° 

^cos^  sin#2y 

(4-42) 

The  factorization  given  above  represents  the  scatter  composition  of 
the  spatial  data.  The  eigen-values  and  the  eigen-vectors  describe  the 
underlying  shape  more  succinctly.  Thus,  each  frame  f(x,y;t)  represented  by 
a  vector,  <t>(/)  =  (r; ,  6u ,  1, ,  12  ,6X,62\  t)T ,  in  addition  to  the  standard  and  weighted 
histograms:  h[g],  and  hw[g],  respectively. 

In  general  a  smoothly  varying  image  sequence  must  entail  smooth 
variations  in  these  parameters.  For  example,  a  constant  motion  of  the 
aircraft  above  an  otherwise  static  landscape  would  entail  gradual  variation  in 
(ru,0u)  indicating  a  constant  velocity  or  constant  acceleration.  This  can  be 

verified  by  computing  the  first  and  second  derivatives  (with  respect  to  time) 
of  these  spatio-temporal  parameters, etc.  Jitters  are  indicated  by 

sudden  discontinuities  in  velocities  due  to  impulsive  or  transient  forces.  The 
simplest  approach  to  detecting  abrupt  discontinuities  in  a  function  is  to  apply 
a  derivative  operator,  and  decide  positive  if  the  output  magnitude  is  above  a 
certain  threshold.  The  threshold  may  have  to  be  determined  adaptively. 

Presence  of  high  frequency  signals  and  noisy  data  pose  serious 
challenge  to  this  approach.  Thus,  it  is  useful  to  preprocess  the  data  to  cancel 
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the  effect  of  noise.  Then,  we  run  the  risk  of  de-emphasizing  a  valid  edge 
data,  due  to  the  low  pass  filtering  nature  of  such  preprocessing  steps.  We 
have  chosen  to  apply  a  robust  multiresolution  technique,  called  Laplacian  of 
Gaussian,  also  know  as  V2  *  G . 

It  is  essentially  a  finite  impulse  response  digital  filter,  whose 
continuous  (analog)  impulse  response  is  of  the  form: 


h(x,  <t) 


2 


S|S  Q  2(7 


(4-43) 


The  function  looks  like  an  inverted  Mexican  hat  as  shown  in  Figure  4-9. 


Figure  4-9:  Inverted  Mexican  hat  signal 
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We  compute: 


g(n)  =  h(n)®f(n ) 

S{n)  =  sign{g{n )) 

E{n)  =  S{n)XORS{n-\) 


where,  g(n)  is  the  generated  signal  used  for  detection,  h(n)  is  the  digital  filter 
whose  equation  is  ( 4-43),  S(n)  is  a  sign  function  which  can  be  expressed  as 
follows: 


Co 


sign(x)  =  \ 


1 


x  <  0 


x  >  0 


(  4-44) 


E(n)= 1,  whenever  there  is  an  edge.  The  procedure  must  be  repeated  at 
least  for  six  a,  i.e.,  h(x,  oq),  h(x,  (Tor),  h(x,  a  or  ),  ...etc.  where  r=  1 . 1 .  The 
reason  why  we  have  at  least  six  a ’s,  is  to  accommodate  wide  range  of 
variations  in  the  imaging  conditions.  The  typical  values  of  h(x)  for  various 
scales  and  resolutions  are  shown  in  Table  4-3  and  plotted  in  Figure  4-10. 
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Table  4-3:  The  digital  filter  used  in  the  jitter  detection  process 


Mask 

Mask  values 

hi 

-0.8094 

-0.10289 

0.3861 

0.112726 

0.008549 

0.000209 

0 

0 

h2 

-0.6849 

-0.32967 

0.24046 

0.292063 

0.114655 

0.02239 

0.002402 

0.000147 

h3 

-0.72272 

-0.27137 

0.317401 

0.246125 

0.06186 

0.006962 

0.000382 

0 

h4 

-0.64948 

-0.37242 

0.150245 

0.306644 

0.175519 

0.05371 

0.009892 

0.001147 

h5 

-0.74284 

-0.23582 

0.347275 

0.214785 

0.04177 

0.003404 

0 

0 

h6 

-0.68486 

-0.32973 

0.240357 

0.292101 

0.114727 

0.022418 

0.002406 

0.000148 
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Figure  4-10:  The  digital  filter  plot  used  in  the  jitter  detection  process 


If  a  zero-crossing  is  found  in  at  least  three  a ’s,  this  gives  an  indication  of 
existence  of  jitter.  The  next  zero-crossing  of  these  o’s  indicates  the  end  of 
the  jitter.  All  frames  between  these  two  consecutive  zero-crossing  instances 
are  to  be  dropped  and  replaced  by  a  suitable  postprocessed  images.  We 
choose  a  simple  method  which  take  the  first  and  the  last  images  in  the 
sequence,  to  generate  a  smoothly  varying  image  sequence.  This  image 
sequence  is  computed  by  the  FFT  estimator  described  in  section  4.3. 
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V.  Results  and  Analysis 


This  chapter  describes  the  experimental  setup  and  the  results  obtained 
in  testing  all  algorithms  developed  in  this  thesis. 

The  results  of  the  affine  model  fitting  algorithm  to  estimate  motion 
vectors  will  be  discussed  first.  The  second  section  describes  the  test  results 
for  the  frequency  domain  method  to  estimate  the  inter-frame  translation  in 
an  image  sequence.  Finally,  the  effectiveness  of  jitter  detection  technique 
will  be  demonstrated. 

5. 1  Affine  Based  Approach  for  Motion  Estimation 

This  approach  was  tested  in  a  simulation  setup  and  a  practical 
experiment.  The  goal  of  the  simulation  was  to  evaluate  the  performance  in  a 
controlled  environment  with  exact  knowledge  about  the  shift  and  rotation 
values  between  two  images  in  an  image  sequence.  This  enables  us  to 
evaluate  the  performance  and  sensitivity  of  the  algorithm.  We  have  also 
tested  the  algorithm  on  a  real  world  imagery  without  any  modifications 

5.1.1  Simulation 

The  scene  we  used  in  the  simulation  was  the  same  scene  used  in  the 
experimental  section.  However,  the  principale  difference  stems  from  the  way 


71 


we  preprocessed  the  image  frames.  For  the  simulated  testing,  each  image 
was  padded  with  a  black  (all  zeros)  pixels  background.  In  contrast,  the 
realistic  testing  considers  the  image  data  without  adding  a  well  defined 
boundary  conditions. 

For  the  simulation,  we  started  from  an  image  shown  in  Figure  5-1.  It 
is  of  size  700  x  400  pixels  with  a  zero-padded  background  making  the  total 
size  of  image  720  x  480  pixels. 


Figure  5-1:  Reference  image 

Two  images  were  constructed  by  shifting  and  rotating  the  original 
image.  These  two  images  were  then  used  as  inputs  to  the  algorithm  to 
estimate  the  motion  vectors  between  each  image  and  the  original  image.  The 
first  image  shown  in  Figure  5-2  is  constructed  by  shifting  the  original  image 
by  19  pixels  in  the  positive  x-axis  direction. 
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Figure  5-2:  The  constructed  image  by  shifting  the  original  image  by  19  pixels 


When  only  shifts  are  applied,  the  motion  vectors  estimation  produces 
perfect  results.  Table  5-1  shows  the  estimated  vectors. 


Table  5-1:  Estimated  Motion  Parameters  for  Pure  Image  Translation 


Parameter 

Value 

Actual 

Estimated 

/V 

crx 

1 

1 

/V 

C  y 

1 

1 

/V 

e 

0 

0 

/V 

a 

0 

0 

Af 

19 

19 

Ay 

0 

0 
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The  second  image  shown  in  Figure  5-3  is  constructed  by  rotating  the 
original  image  by  0  =  0.0524  radians.  The  rotation  was  centered  at  the  center 
of  the  image  grid. 


Figure  5-3:  The  constructed  image  by  rotating  the  original  image  by  0.0524  radians 

When  rotation  is  applied,  as  shown  in  Figure  5-3,  the  rotation  angle  is 
accurately  estimated  but  with  not  well  estimated  shifts.  Table  5-2  shows  the 
actual  parameters  along  with  the  estimated  ones. 
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Table  5-2: 


Estimated  Motion  Parameters  for  Pure  Image  Rotation 


Parameter 

Value 

Actual 

Estimated 

/V 

CTx 

1 

0.95496 

/V 

(7y 

1 

1.0236 

/V 

e 

0.0524 

0.0577 

/v 

a 

0 

0.0126 

<i 

0 

31.662 

Ay 

0 

-18.287 

The  errors  in  the  shift  estimation  are  due  to  the  interpolation 
approximation  made  when  rotating  the  image. 

Finally,  the  original  image  in  Figure  5-1  and  the  two  estimated  motion 
vectors  in  Table  5-1  and  Table  5-2  are  used  as  inputs  to  the  reconstruction 
part  of  the  algorithm.  Figure  5-4  and  Figure  5-5  show  the  reconstructed 
images  from  the  estimated  motion  parameters  in  Table  5-1  and  Table  5-2 
respectively. 


75 


Figure  5-4:  Reconstructed  image  from  Table  5-1 


Figure  5-5:  Reconstructed  image  from  Table  5-2 
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The  precision  of  the  image  in  Figure  5-4  was  sufficient  to  have  a  good 
reconstruction.  However,  the  image  in  Figure  5-5  is  in  acceptable  precision 
except  for  the  shifts. 

5.1.2  Experiment 

As  described  in  the  last  section,  the  images  used  in  the  experimental 
testing  are  a  real  world  images  without  any  preprocessing.  Two  images 
taken  at  time=?  and  time=t+7  were  to  be  considered.  These  two  images  are 
shown  in  Figure  5-6  and  Figure  5-7  respectively. 
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Figure  5-7:  Image  taken  at  time  =  t+1 


Because  of  the  small  amount  of  motion  between  the  two  consecutive 
images,  the  non-overlapping  part  between  them  is  small.  Figure  5-8  shows 
an  inverted  version  of  the  difference  between  the  two  images. 
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Figure  5-8:  Difference  between  image  at  time=  t  and  image  at  tim=  t+1 


Unlike  the  simulation  section,  we  will  have  an  inside  and  more  deep 
look  at  the  algorithm  in  this  section.  Firstly,  we  will  start  by  plotting  the 
histogram  and  the  cumulative  histogram  of  the  image  at  time=t  which  is  the 
start  point  of  the  segmentation  process.  The  histogram  and  the  cumulative 
histogram  are  shown  in  Figure  5-9  and  Figure  5-10  respectively.  Figure  5-10 
also  shows  the  gray  levels  of  the  six  segmented  subimages  Bi  to  B<$.  Figure 
5-11  shows  these  subimages. 
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Figure  5-9:  Image  Histogram 
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Figure  5-10:  Cumulative  Histogram 
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Subimage  B3 


Subimage  B4 


Subimage  B5  Subimage  B6 

Figure  5-11:  Binary  subimages  resulted  from  segmentation  process 
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After  extracting  the  binary  subimages,  the  correlation  of  rotation  angle 
6  and  deformation  angle  a  is  plotted  to  determine  the  proper  group  of 
subimages  that  should  be  used  to  calculate  the  motion  vectors.  Figure  5-12 
shows  rotation-deformation  angles  plot. 


By  using  the  approach  described  in  section  3.1.6,  we  can  find  the  best 
set  of  subimages  which  can  be  used  to  calculate  the  motion  vectors.  Table 
5-3  shows  the  final  estimated  motion  vectors. 
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Table  5-3:  Estimated  motion  vectors 


Parameter 

Value 

/V 

1.0009 

CTx 

/V 

c>y 

1.0145 

/V 

e 

0.014805 

/V 

a 

0.003251 

Ax 

-1.7065 

< 

-11.281 

At  this  point,  we  can  use  the  estimated  motion  vectors  to  reconstruct 
the  image  at  tim e=t+l.  Figure  5-13  show  the  reconstructed  image. 


Figure  5-13:  Reconstructed  image  at  time=t+l 


The  original  image  at  time=t+l  in  Figure  5-7  and  the  reconstructed 
image  in  Figure  5-13  shows  some  differences.  Figure  5-14  shows  these 
differences. 
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Figure  5-14:  Difference  between  image  at  time=t+7  and  the  constructed  image 


In  order  to  get  more  precise  motion  vector  estimates,  the  difference 
between  the  image  at  time=t+l  and  the  constructed  image  must  be 
minimized. 


85 


5.2  Frequency  Domain  Approach  to  Estimate  Image  Translation 

Following  the  same  scheme  used  in  the  previous  section  to  evaluate 
an  algorithm,  two  types  of  test  are  conducted.  We  will  have  a  simulation  and 
an  experiment  for  the  same  reason  mentioned  previously. 

5.2.1  Simulation 

For  the  part  of  simulation,  we  used  the  image  shown  in  Figure  5-15  as 
an  original  image.  This  image  is  of  size  620  x  400  pixels  with  a  zero-padded 
background,  making  the  total  size  of  720  x  480  pixels.  A  new  image  has 
been  constructed  by  applying  image  translation  effect  to  the  original  image. 
Figure  5-16  shows  this  constructed  image. 


Figure  5-15:  Original  image 
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Figure  5-16:  Constructed  image  by  shifting  the  original  image 

These  two  images  are  used  as  inputs  to  the  FFT  motion  estimator  to 
estimate  the  motion  translation  between  the  constructed  image  and  the 
original  image.  The  FFT  estimator  gave  perfect  results  during  the  process  of 
simulation.  It  could  produce  the  exact  values  of  translations.  Table  5-4 
shows  this  result. 


Table  5-4:  Estimated  motion  translation  of  the  simulated  images 


Parameter 

Value 

Actual 

Estimated 

Ax 

-15 

-15 

at 

-40 

-40 
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5.2.2  Experiment 


As  in  the  previous  section,  in  this  part  of  FFT  estimator’s  evaluation,  a 
more  realistic  sequence  of  images  is  used  as  inputs.  Figure  5-17  and  Figure 
5-18  show  two  images  that  are  taken  from  a  video  stream. 

The  translation  between  these  two  images  is  to  be  estimated  using 
FFT  estimator  developed.  It  has  been  assumed  that  the  rotation  and  scaling 
factors  between  these  two  images  are  very  small  and  hence  can  be  neglected. 
In  this  case,  the  image  at  time=t+7  can  be  totally  reconstructed  using  the 
estimated  translations. 


Figure  5-17:  Image  at  time=t 
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Figure  5-18:  Image  at  time=t+7 


We  have  seen  that  the  shift  parameter  Ax  and  Ay  can  be  computed  as 
the  slope  of  the  phase  difference  between  the  two  images.  The  first  step  here 
is  to  plot  the  phase  difference  of  the  two  images.  Figure  5-19  shows  such 
plot. 
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Figure  5-19:  Phase  difference  of  images  at  time=t  and  time=f+/ 


The  parameter  Ax  is  the  shift  in  ^-direction.  It  is  computed  by 
eliminating  the  effect  of  shift  in  v-direction.  This  is  done  by  plotting  the 
phase  difference  along  w-axis  and  setting  v=0  as  in  Figure  5-20.  Same 
method  is  used  to  get  the  parameter  Ay. 
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Figure  5-21:  Phase  difference  in  v-axis 

91 


As  mentioned  earlier,  the  slope  of  phase  differences  in  each  axis 
represent  the  estimated  shift  parameters.  Table  5-5  shows  the  final  results. 

Table  5-5:  FFT  estimated  motion  translation  of  the  simulated  images 


Parameter 

Value 

Ax 

-27 

at 

13 

In  order  to  evaluate  the  results  obtained,  a  new  image  can  be 
constructed  from  image  at  time=?  and  the  estimated  motion  translation  then 
compared  to  the  image  at  time=?+7.  Figure  5-22  shows  the  constructed 
image. 


Figure  5-22: 


Reconstructed  image  from  translation  estimated  values 
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Figure  5-23  shows  the  error  in  difference  between  the  constructed 
image  and  the  image  at  time=t+l.  This  difference  has  been  inverted  for 
better  viewing. 


Figure  5-23:  The  difference  between  the  constructed  image  and  image  at  time  =  t+1 


We  can  notice  that  the  estimated  values  are  precise  enough  to  produce 
a  good  constructed  image. 

Only  translation  was  considered.  The  other  motion  vectors  like 
rotation  and  scaling  which  are  to  be  investigated  in  a  future  work  using  the 
Fourier  Domain  approach. 
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5.3  Jitter  Detection  Algorithm 


This  section  presents  experimental  results  obtained  from  a  video 


sequence.  The  image  frame  sequence  is  of  size  320  x  240  pixels  and 


contains  336  frames  in  length.  The  frame  rate  is  28  frames  per  second. 


The  video  is  used  as  input  to  the  developed  jitter  detection  algorithm. 


The  Motion  parameter  0  values  of  the  sequence  are  depicted  in  Figure  5-24. 
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Figure  5-24:  The  motion  parameter  0  of  the  frames  sequence 
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The  output  of  convolving  this  signal  with  the  multi-resolution  LoG 


filters  shown  in  Figure  5-25. 


Figure  5-25:  Convolution  output  of  6  and 

It  shows  the  existence  of  two  jitters.  The  first  jitter  starts  at  frame  101 
and  lasts  for  26  frames.  The  second  jitter  starts  at  frame  218  and  lasts  for  24 
frames.  Unfortunately,  still  images  are  not  the  proper  way  to  display 
dynamic  process  like  video  stabilization.  But,  the  result  can  be  shown  as  a 
sequence  of  still  images.  Figure  5-26  and  Figure  5-27  show  the  frame 
images  of  these  two  video  jitters. 
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Figure  5-26:  The  frames  of  the  first  jitter 
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Figure  5-27:  The  frames  of  the  second  jitter 
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Frame  101  must  be  registered  with  frame  126.  Same  thing  should  be 
done  to  frame  218  and  frame  242.  Either  image  registration  algorithms 
developed  in  this  thesis  can  be  used  to  do  this  task.  FFT  approach  has  been 
selected  to  do  such  job.  Table  5-6  shows  the  resulted  estimated  motion 
parameters. 


Table  5-6:  Estimated  Motion  Vectors  for  the  two  jitters 


Parameter 

Estimated 

Value 

Ax 

-38 

Ay 

17 

Frame  101  to  Frame  127 

Parameter 

Estimated 

Value 

Ax 

-26 

Ay 

11 

Frame  218  to  Frame  242 

The  next  step  now  is  to  reconstruct  images  according  to  the  estimated 
motion  vectors.  We  can  note  that  only  the  image  translations  were  estimated, 
that  is  because  the  FFT  motion  estimator  is  selected  to  estimate  the  motion 
vectors.  This  is  acceptable  because  of  the  nature  of  the  dataset  under  testing. 
Dropped  frames  were  substituted  by  a  reconstructed  version  of  the  last  frame 
in  the  jitter.  To  give  impression  of  a  smooth  transition  between  frames,  the 
amount  of  translation  in  these  frames  were  gradually  increased  until  last 
frame  of  the  jitter  reached.  Figure  5-28  and  Figure  5-29  show  the  final 
results. 
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Figure  5-28:  The  frames  of  the  first  jitter  after  stabilization 
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Frame  223 


Frame  224 


Frame  225 
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Figure  5-29:  The  frames  of  the  second  jitter  after  stabilization 
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As  mentioned  earlier  in  describing  the  jitter  detection  technique 
developed  in  this  thesis,  the  missing  part  of  the  images  resulted  from  images 
reconstruction,  should  be  substituted  from  future  frames.  This  should  not  be 
very  difficult  and  it  is  recommended  to  be  done  as  a  future  work  for  this 
thesis. 
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VI.  Conclusion 


6.1  Summary 

In  this  thesis,  we  have  presented  a  video  stabilization  technique.  The 
underlying  technology  of  motion-estimation,  jitter  detection,  and  image 
registration,  have  been  described.  We  presented  the  formulation  we  used  to 
implement  real-world  video  stabilization  algorithms  and  the  results  obtained 
with  these  algorithms.  We  also  presented  the  required  analysis  to  fully 
develop  the  approach.  An  Affine-based  approach  that  tracks  a  small  set  of 
features  was  used  to  estimate  the  movement  of  the  camera  between 
consecutive  frames.  Fourier  transform  was  also  used  to  demonstrate 
translation  estimation  between  images.  The  resulting  translation  estimate 
was  robust  and  fast. 

6.2  Limitations 

The  displayed  frames  are  always  delayed  by  several  units  of  time.  In 
general,  up  to  five  frames  of  delay  is  both  adequate  and  acceptable.  In  a 
realistic  video  acquired  for  30  frames/second,  this  delay  amount  to  1/6  of  a 


102 


second.  Research  studies  on  human  perception  of  images  indicate  that  this 
delay  is  not  of  adverse  impact  for  routine  tasks. 

When  the  scene  is  comprised  of  ground  moving  targets,  interpolation 
based  on  the  first  and  the  last  frame  as  used  here,  will  not  be  adequate.  A 
more  complex  technique  will  have  to  be  developed.  It  is  left  for  future 
development  of  this  effort. 

6.3  Additional  Remarks 

We  have  applied  an  expanded  version  of  the  jitter  detection  and 
compensation  on  a  real  world  UAV  data.  The  dataset  and  the  experimental 
findings  are  not  included  in  this  document  for  logistic  reasons. 

6.4  Future  work 

Current  work  from  this  thesis  can  be  extended  to  improve  the 
performance  and  reduce  the  constraints  on  camera  motion.  Possible 
improvements  include: 

1.  Extending  the  FFT  estimator  to  estimate  rotation  and  other  motion 
vectors  besides  translation. 

2.  Adding  a  process  to  the  jitter  detector  to  compensate  the  missing  parts 
of  images  which  occur  due  to  image  reconstruction. 
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Appendix  A 


This  appendix  lists  the  Matlab  code  developed  in  this  research. 


1 

2 

3 

4 

5 

6 

7 

8 

9 

10 
11 
12 

13 

14 

15 

16 

17 

18 

19 

20 
21 
22 

23 

24 

25 

26 

27 

28 

29 

30 

31 

32 

33 

34 

35 

36 


aff inmethod . m 


Author:  Capt .  Mohammed  A.  Alharbi 
Date  :  12- January-2006 
Description : 

The  function  is  the  controller  of  the  based  affine 
transformation  method  to  register  two 
images.  It  takes  two  images  and  construct  a  third 
image  %based  on  the  calculated  motion  vectors . 
Usage : 

image 3=aff inmethod ( image, image 2 ) 

Input : 

-  the  first  RGB  image  at  time=t 

-  the  second  RGB  image  at  time=t+l 


image 1 
image2 
Output : 
image3 


the  reconstructed  image  based  on  the 
estimated  motion  vectors 


function  [ image 3 ] =af f inemthod ( image 1 , image 2 ) 
clear  all; 
clc; 

[ theta,  alpha,  sigmax,  sigmay,  cl,  c2,  sizesl,  sizes2]  =... 
main (imagel, image2) ; 

E=FullTree (theta, alpha) ; 
nMST=MinSpanTree (E) ; 
dMAX=max (E ( :  ,  3)  )  ; 

MST=E (nMST, : ) ; 
z=MST ( : , 3) <=dMAX*0 . 1; 

Clusters=MST (z, : ) ; 

F=f orests (Clusters ( :  ,  1:2)  )  ; 
plotter (theta, alpha, Clusters)  ; 

[properCluster , SubimagesNumbers ] =f indCluster2 (Clusters, E) ; 
[alphabar,  thetabar,  sigmaxbar,  sigmaybar,  clbar,  c2bar]  =... 
calcAve rages  (properCluster,  sizesl ,  alpha,  theta, ... 
sigmax, sigmay, cl , c2 ) 
end 


main . m 

on  9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9--9--9-- 

O  I  ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

38  %  Description: 

39  %  This  function  takes  two  images  as  inputs  and  calculated 
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40 

o 

o 

the  motion  vectors  between  them 

41 

o 

o 

Usage : 

42 

o 

o 

[Q1,Q2,Q3,Q4,Q5,Q6, sizesl, sizes2]  =  main ( image , image2 ) 

43 

o 

o 

Input : 

44 

o 

o 

imagel  -  the  first  RGB  image  at  time=t 

45 

o 

o 

image2  -  the  second  RGB  image  at  time=t+l 

46 

o 

o 

Output : 

47 

o 

o 

Q1  -  Rotation  angle  (theta) 

48 

o 

o 

Q2  -  Deformation  angle  (alpha) 

49 

o 

o 

Q3  -  Scaling  factor  in  x-axis 

50 

o 

o 

Q4  -  Scaling  factor  in  y-axis 

51 

o 

o 

Q5  -  Shift  in  x-axis 

52 

o 

o 

Q6  -  Shift  in  y-axis 

53 

o 

o 

sizesl  -  The  size  of  binary  images  of  the  first  image 

54 

55 

56 

o 

o 

o 

sizes2  -  The  size  of  binary  images  of  the  second  image 

o 

oooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

57 

function  [Q1,Q2,Q3,Q4,Q5,Q6, sizesl, sizes2]  = 

58 

myf un ( iml , im2 ) 

59 

[Ml , sizesl ] =ex tract 8 features ( iml ,  1 )  ; 

60 

[M2 ,  sizes 2 ] =ex tract 8 features ( im2 ,  2 )  ; 

61 

Qalpha= [ ] ; 

62 

Qtheta= [ ] ; 

63 

Qsigmax= [ ] ; 

64 

Qsigmay= [ ] ; 

65 

QC1= [ ] ; 

66 

QC2= [ ] ; 

67 

e=l  ; 

68 

for  i=l : 6 

69 

for  j  =i  +  l :  6 

70 

for  k= j  +1 :  6 

71 

X=  [Ml  (2*i-l)  ,M1  (2* j -1)  ,M1  (2*k-... 

72 

1) ;M1 (2*i) f Ml (2* j ) ,M1 (2*k) ] ; 

73 

Xbar= [M2 (2*i-l) f M2 (2* j -1 ) , M2 (2*k- 

74 

...1)  ;  M2  ( 2  *  i )  ,  M2  ( 2  *  j  )  ,  M2  ( 2  *  k )  ]  ; 

75 

Z=simequ2 (XfXbar) ; 

76 

matrices 

77 

[Q, Qalpha,  Qtheta, Qsigmax, Qsigmay, QC1 ,  QC2 ] =trans ( Z ,  Qalpha 

78 

r 

Qtheta, Qsigmaxf  Qsigmayf  QC1 ,  QC2 ) ; 

79 

end; 

80 

end; 

81 

end; 

82 

Q1  =  Qtheta; 

83 

Q2  =  Qalpha; 

84 

Q3  =  Qsigmax; 

85 

Q4  =  Qsigmay; 

86 

Q5  =  QC1 ; 

87 

Q6  =  QC2 ; 

88 

end 

Extract 6features .m 


89 

o  o 
o  o 

oooooooooooooooooooooooooooooooooooo 

oooooooooo 

ooooooooooo 

90 

o 

o 

Description : 

91 

o 

o 

This  function  extracts  six  binary 

subimages 

out  of  an 

92 

o 

o 

input  image.  Then,  it  outputs  the 

sizes  and 

the 

93 

o 

o 

centroid  of  these  binary  images. 
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94 

95 

96 

97 

98 

99 

100 
101 
102 

103 

104 

105 

106 

107 

108 

109 

110 
111 
112 

113 

114 

115 

116 

117 

118 

119 

120 
121 
122 

123 

124 

125 

126 

127 

128 

129 

130 

131 

132 

133 

134 

135 

136 

137 

138 

139 

140 

141 

142 

143 

144 

145 

146 

147 

148 

149 

150 

151 


o 

o 


Usage : 

Extract 6features (imagel) 
Input : 

imagel 
Output : 

centroids 
Imsizes 


-  an  RGB  image . 

-  The  centroid  of  the  six  binary  subimages 

-  The  sizes  of  the  six  binary  images. 


function  [ centroids , imsizes ]  =  features ( imfile, imno) 
writeimages=l ; 
showimages=0 ; 
close  all; 

12  =  imread ( imf ile) ; 
if  isrgb ( 12 ) 

I2=rgb2gray ( 12 )  ; 
end 

h  =  imhist ( 12 ) ; 

H=cumsum (h) ; 

[M,N] =size (12) ; 
k=M*N; 
kl=k/6; 
k2=2  *k/ 6 ; 
k3=3*k/ 6 ; 
k4  =  4  *k/ 6 ; 
k5=5*k/6; 
k6=6*k/ 6 ; 

[val, indx] =min (abs (H-kl ) ) ; 
klhat=indx; 

[val, indx] =min (abs (H-k2 ) ) ; 
k2hat=indx; 

[val, indx] =min (abs (H-k3) ) ; 
k3hat=indx; 

[val, indx] =min (abs (H-k4 ) ) ; 
k4hat=indx; 

[val, indx] =min (abs (H-k5) ) ; 
k5hat=indx; 

[val, indx] =min (abs ( H— k  6 ) ) ; 
k6hat=indx; 

%  The  first  binary  image 
B i~ 12; 

Bl=double (12<klhat) ; 

Bl=logical (Bl) ; 

B1=-B1; 

imsizes ( 1 ) =sum ( sum (Bl ) ) ; 

%  The  second  binary  image 
B2=double ( 12>klhat &12<k2hat ) ; 

B2=logical (B2) ; 

B2=~B2 ; 

imsizes ( 2 ) =sum ( sum (B2 ) ) ; 

%  The  third  binary  image 
B3=double ( 12>k2hat &12<k3hat ) ; 

B3=logical (B3) ; 

B3=~B3 ; 

imsizes ( 3 ) =sum ( sum (B3 ) ) ; 

%  The  fourth  binary  image 
B4=double ( 12>k3hat &12<k4hat ) ; 
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152 

153 

154 

155 

156 

157 

158 

159 

160 
161 
162 

163 

164 

165 

166 

167 

168 

169 

170 

171 
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173 

174 

175 

176 

177 

178 

179 

180 
181 
182 

183 

184 

185 

186 

187 

188 

189 

190 

191 

192 

193 

194 

195 

196 

197 

198 

199 

200 
201 
202 

203 

204 

205 

206 

207 

208 
209 


B4=logical  (B4)  ; 

B4=~B4 ; 

imsizes ( 4 ) =sum ( sum (B4 ) ) ; 

%  The  fifth  binary  image 
B5=double ( I2>k4hat &l2<k5hat ) 
B5=logical (B5) ; 

B5=~B5 ; 

imsizes (5) =sum (sum (B5) ) ; 

%  The  sixth  binary  image 
B6=double ( I2>k5hat &l2<k6hat ) 
B6=logical (B6) ; 

B6=~B6 ; 

imsizes (6) =sum (sum (B6) ) ; 
if  imno==l 

f ilenamel= ' images\Bll . bmp  1 ; 
f ilename2= ' images\B21 . bmp  1 ; 
filename 3= 1  images \B31 . bmp  1 ; 
filename 4= 1  images \B 4 1 . bmp  1 ; 
filename 5= ' images \B51 . bmp  1 ; 
f ilename6= ' images\B61 . bmp  1 ; 
else 

f ilenamel= ' images \B 12 . bmp  1 ; 
f ilename2= 1  images \B2 2 . bmp  1 ; 
filename 3= 1  images \B3 2 . bmp  1 ; 
f ilename4=  f images\B42 . bmp  1 ; 
filename 5=  f images \B5 2 . bmp ' ; 
filename 6= 1  images \B 6 2 . bmp  1 ; 
end; 

if  writeimages 
imwrite (Bl, filenamel) ; 
imwrite (B2, filename2) ; 
imwrite (B3, filename3) ; 
imwrite (B4, filename4) ; 
imwrite (B5 , filename5) ; 
imwrite (B6, filename6) ; 
end; 

a=centroid ( filenamel ) ; 

MM (1,1) =a ( 1 )  ; 

MM (2, 1) =a  (2)  ; 
a=centroid ( f ilename2 ) ; 

MM (1,2) =a ( 1 ) ; 

MM  (2, 2)  =a  (2)  ; 
a=centroid ( f ilename3 ) ; 

MM ( 1 , 3 ) =a ( 1 )  ; 

MM (2 , 3) =a (2 ) ; 
a=centroid ( f ilename4 ) ; 

MM (1,4) =a ( 1 ) ; 

MM  (2, 4)  =a  (2)  ; 
a=centroid ( f ilename5 ) ; 

MM (1,5) =a ( 1 )  ; 

MM  (2,5)  =a  (2  )  ; 
a=centroid ( f ilename6 ) ; 

MM  (1,6)  =a  ( 1 )  ; 

MM (2 , 6) =a  (2)  ; 
if  showimages 
figure ; 

imshow(Bl) ; figure; 
imshow(B2) ; figure; 
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210 

imshow (B3) ; figure; 

211 

imshow(B4) /figure; 

212 

imshow (B5) ; figure; 

213 

imshow ( B  6 ) ; 

214 

end; 

215 

centroids  =  MM; 

216 

end 

centroid . m 

217 

ooooooooooooooooooooo 

OOOOOOOOOOO' 

218 

%  Description: 

219 

%  This  function  calculated  the 

220 

%  image . 

221 

%  Usage: 

222 

%  B  centroid  =  centroid ( imfile 

223 

%  Input: 

224 

%  imfile  -  The 

file  name  o: 

225 

%  Output : 

226 

%  B  centroid  -  The 

calculated  < 

227 

ooooooooooooooooooooo 

OOOOOOOOOOO' 

228 

229 

function  B  centroid  = 

centro ( imf . 

230 

im  =  imread ( imfile) ; 

231 

[rows, cols]  =  size(im); 

232 

x  =  ones (rows, 1) * [1 : 

cols]  ; 

233 

y  =  [l:rows] '*ones(l 

,  cols 

234 

area  =  sum (sum (im) ) ; 

235 

meanx  =  sum ( sum (double ( im)  . *x) )  , 

236 

meany  =  sum ( sum (double ( im)  . *y)  )  . 

237 

B  centroid  =  [meanx. 

meany]  ; 

238 

end 

a  binary 
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simeq2 . m 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

%  Description: 

%  This  function  solves  three  simultaneous  equations. 

%  Usage: 

%  result  =  simeq2 (col ,  co2 ) 

%  Input: 

%  col  -  Coefficients  of  the  first  set  of  equations 

%  co2  -  Coefficients  of  the  second  set  of  equations 

%  Output : 

%  result  -  The  result  of  the  solved  equations 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


function  result=myf unc (col ,  co2 ) 
xll=col (1,1) ; 
xl2=col (1,2) ; 
xl3=col (1,3)  ; 
yll=col (2,1)  ; 
yl2=col (2,2)  ; 
yl3=col (2,3) ; 
xllbar=co2 (1, 1) ; 
xl2bar=co2 (1,2)  ; 
xl3bar=co2 (1, 3)  ; 
yllbar=co2 (2, 1) ; 
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262  yl2bar=co2  (2 , 2 )  ; 

263  yl3bar=co2  (2 , 3) ; 

264  XX  =[xll  yll  0010; 

265  0  0  xll  yll  0  1; 

266  xl2  yl2  0  0  1  0; 

267  0  0  xl2  yl2  0  1; 

268  xl3  yl3  0010; 

269  0  0  xl3  yl3  0  1] ; 

270  b=[xllbar  ;yllbar  ;xl2bar;  yl2bar  ;xl3bar  ;yl3bar  ]; 

271  result  =  XX\b; 

272  end 
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294 
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trans . m 


Description : 

This  function  calculates  the  values  of  the  scaling  in 
x-axis  and  y-axis.  Also,  it  calculates  the  values  of 
the  rotation  angle  and  the  deformation  angle.  Then,  i 
reorganizes  shift  values  and  put  them  in  matrices.  It 
produces  all  vectors  of  motion  as  arrays. 

Usage : 

[Qalphas , Qthetas , Qsigmaxs , Qsigmays , QCls , QC2s ]  = 
trans (M, Qalpha, Qtheta, Qsigmax, Qsigmay, QC1 , QC2 ) 

Input : 

Qalpha  -  Deformation  angle. 

Qtheta  -  Rotation  angle. 

Qsigmax  -  Scaling  factor  in  x-axis 
Qsigmay  -  Scaling  factor  in  y-axis 

-  Shift  in  x-axis 

-  Shift  in  y-axis 


QC1 

QC2 

Output : 

Qalphas 

Qthetas 


-  Array  of  deformation  angles. 

-  Array  of  rotation  angles. 

Qsigmaxs  -  Array  of  Scaling  factors  in  x-axis 

Qsigmays  -  Array  of  Scaling  factors  in  y-axis 

QCls  -  Array  of  Shifts  in  x-axis 

QC2s  -  Array  of  Shifts  in  y-axis 


function  [Qalphas , Qthetas , Qsigmaxs , Qsigmays , QCls , QC2s ]  = 
my fun (M, Qalpha, Qtheta, Qsigmax, Qsigmay, QC1 , QC2 ) 
sigmax=sqrt (M ( 1 ) +M (2 ) ) ; 
sigmay=sqrt (M (3 ) +M ( 4 ) ) ; 

C1=M ( 5 ) ; 

C2=M ( 6 ) ; 

alphaminustheta=atan2 (M(2)  ,M(1)  )  ; 
alphaplustheta=atan2 (M ( 3 ) ,M(4) ) ; 
alpha= (alphaminustheta+alphaplustheta) /2; 
the ta=alphaplus theta- alpha ; 

Qalphas= [Qalphas  ;  alpha]; 

Qthetas= [Qthetas  ;  theta]; 

Qsigmaxs= [Qsigmaxs ;  sigmax] ; 

Qsigmays= [Qsigmays ;  sigmay] ; 

QCls= [QCls ;  Cl]; 

QC2s= [QC2s ;  C2]; 
end 


Fulltree . m 
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Description : 

This  function  creates 
sets  of  points. 

Usage : 

[Q]=Fulltree (X,  Y) 
Input : 

X 


a  full  tree  graph  of  a  given  two 


Y  - 

Output 


The  x  coordinates  of  the  points  to  be  converted  in 
a  fully  tree  graph. 

The  Y  coordinates  of  the  points. 


Q  -  Edges  of  the  created  full  tree  graph. 


function  [Q]  =  myfun(X,Y) 
z  =  l  ; 

for  i=l : 1 9 

for  j  =i  +  l : 2 0 

Q(z,:)  =  [±  j  sqrt  (  (X  ( i )  -X  ( j  )  )  A2  +  (Y  (i) -Y  ( j  )  )  A2  )  ]  ; 
z  =  z  + 1  ; 
end 
end 
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MinSpanTree .m 


This  function  solves  the  minimal  spanning  tree  problem 
for  a  connected  graph. 

Input  parameter: 

E(m,  2)  or  (m,  3)  -  the  edges  of  graph  and  their  weight; 

1st  and  2nd  elements  of  each  row  is  numbers  of 
vertexes ; 

3rd  elements  of  each  row  is  weight  of  edge; 
m  -  number  of  edges. 

If  we  set  the  array  E(m,  2),  then  all  weights  is  1. 
Output  parameter: 

nMST  -  the  list  of  the  numbers  of  edges  included 
in  the  minimal  (weighted)  spanning  tree  in  the 
Including  order. 

Uses  the  greedy  algorithm. 

Author:  Sergiy  Iglin 
e-mail:  iglin@kpi.kharkov.ua 
or:  siglin@yandex.ru 

personal  page:  http://iglin.exponenta.ru 


function  nMST=MinSpanTree (E) 

[m,n,E]  =  Evalidation (E) ; 

En={ ( 1 : m) '  ,E]  ; 

[Emin, nMST] =min (En ( : , 4) ) ;  nVer= [En (nMST, 2 : 3 ) ] ; 
En=En (setdiff ( [1 :m] , nMST) , : ) ; 
while  length (nVer) <n, 

Encurr= [ ] ; 
for  k=l : size (En,  1)  , 
if  sum ( ismember (En ( k, 2 : 3 ) , nVer ) ) ==1 , 

Encurr= [Encurr ; En ( k,  : )  ]  ; 
end 
end 

if  isempty (Encurr )  , 
error (' The  graph  is  not  connected!') 


no 


372  end 

37  3  [Emin,  imin] =min (Encurr  (  : , 4 ) ) ; 

374  nEdge=Encurr ( imin, 1 ) ; 

375  nMST= [nMST, nEdge] ; 

376  nVer=unique ( [nVer , Encurr ( imin, 2:3)]); 

37  7  En=En ( setdif f ( [ 1 : size (En, 1 ) ] , find (En ( : , 1 ) ==nEdge) )  ,  : ) ; 

378  end 

379  return 

380  end 
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forests . m 

9'9'9'9'9'9'9'9'2'9'9'9'9'9'9'9'2'9'9'9'9'9'9'9'2'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'2'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9' 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

%  Description:  This  algorithm  builds  forests  out  of  a  group 
%  of  nodes  using  Kruska's  algorithm. 

o 

o 

%  Usage: 

%  [Q] =f orests (A) . 

%  Input: 

%  A  -  A  group  on  nodes . 

%  Output : 

%  Q  -  A  graph  consist  of  forest. 

9'9'9'9'9'9'9'9'2'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'9'2'9'9'9'9'9'9'9'9'9'2'9'9'9'9'9'9'9'9'9'9' 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

function  [Q]=myfun(A) 

B=  [  ]  ; 

k=max  (A  (  :  )  )  ; 

M  =  sparse (A ( : , 1 ) , A ( : , 2 ) , 1 , k, k) ; 

M=M+M 1  ; 

Ml  =  zeros (size (M) ) ; 
flag  =  1; 
while  flag 

Ml=double ( (M+M*M) ~=0) ; 
if  isequal (M, Ml ) 
flag  =  0; 
end 
M=M1  ; 
end 

M=unique (M,  1  rows ' )  ; 

M  (all  (~M,  2 )  ,  :)  =  []; 
for  (i=l : size (M, 1) ) 
a=f ind (M (i,  : ) )  ; 
for  ( j =1 : size (A, 1 ) ) 
if  (length (intersect (a, A (j ,:)) )  >  0) 

B=[B;  A ( j , : ) ] ; 
end 
end 

B= [B ;  [0  0] ]  ; 

end 
Q=B; 


Plotter . m 

41  O  9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 2- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 2- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 9- 

±  y  ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

420  %  Description: 

421  %  This  function  plots  a  set  of  nodes  in  a  graph. 

422  %  Usage: 

423  %  Plotter (X,Y, zo) 

424  %  Input: 

425  %  X  -  The  x-coordinates  of  the  nodes 


ill 


426  %  Y  -  The  y-coordinates  of  the  nodes 

427  %  zo  -  Distances  between  the  nodes 

428  %  Output: 

429  %  none 

/ion 

4  JU  oooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

431 

432  function  myplot (X, Y, zo) 

433  plot (X, Y, '  *  '  ) ; 

434  axis  equal 

435  ylim= [ -2  2.5] ; 

436  ylabel ( ’alpha' )  ; 

437  xlabel (’ theta ’ ) ; 

438  t= [ 1 : 2  0 ]  ’ ; 

439  T=num2str ( t ) ; 

440  text (X,  Y,  T ) ; 

441  for  (i=l : size  (zo, 1) ) 

4  42  s=zo  (i,  1 ) ;  e=zo(i,2); 

443  line ( [X ( s ) ;  X(e)],  [Y(s)  ;Y(e)]) 

444  end 
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findcluster2 .m 


o 

o 

o 

o 

o 

o 

o 

o 

o 

o 

o 

o 

o 

o 

o 

o 

o 

o 

o 

o 

o 

o 

o 

o 

o 

o 


Description : 

The  function  finds  the  best  group  of  nodes  (binary 
subimages)  that  can  be  used  to  determine  the  motion 
vectors . 

Usage : 

[m, subno]  =  f indcluster2 (n, E) 

Input : 

n  -  The  nodes  (binary  subimages)  in  the  graph 

E  -  The  cluster  contains  these  nodes 


Output : 

m  -  The  nodes  (binary  subimages)  that  should  be 

used  to  determine  motion  vectors . 
subno  -  The  index  of  these  nodes. 


%The  following  function  takes  clusters  matrix 
%It  generates  the  data  structure  necssary  to  determine  the 
%  best  cluster  to  use 
function  [m, subno] =myfun (n, E) 
n=sortrows (n, 3) ; 
maxdistance=0 . 3; 
m=*  I  ]  ; 
subno= [ ] ; 

subnocell=cell (10, 1)  ; 

q=  [  ]  ; 

for  i=l : 6 
for  j  =i  +  l :  6 
for  k= j  +1 : 6 

q=[q;  i/j A]; 

end 

end 

end 

M=cell (10,1) ; 

Mi=l ; 

currents  (1,  :  )  ; 

M{ 1 } ^current ; 
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subnocell { 1 }=union (q (current (1) ,  : ) , q (current (2) ,  :  )  )  ; 
if  size  (subnocell { 1 }, 2 ) ==5 
m=M { 1 } ; 

subno=subnocell { 1 }  ; 
return 
end 

for  i=2 : size  (n, 1) 
notinserted=true ; 
currents  ( i ,  :)  ; 

j=i; 

while  j<=Mi  &  notinserted 
if  find  (M {  j  }  ( :  ,  1 :  2 )  ==current  ( 1 )  | ... 

M{ j }  (  :  ,  1 :  2 ) ==current  (2 ) ) 

MM=  [  ]  ; 

MM=[M{j};  current] ; 

if  clusthrecheck (E, MM) <=maxdistance 
M{ j } =MM ; 

subnocell  {  j  }=union  (subnocell  {  j  } , ... 
union (q (current (1) ,  : ) ,  q (curent (2 ),:)))  ; 
if  size ( subnocell { j } ,  2 ) ==5 
m=M{ j } ; 

subno=subnocell { j } ; 
return 
end 

end 

not insert ed=f alse; 
end 

end 

if  notinserted 
Mi=Mi+l ; 

M{Mi } ^current ; 

subnocell {Mi }=union (q (current (1) ,  : ) , q (current (2) , : ) ) ; 
if  size ( subnocell {Mi } ,  2 ) ==5 
m=M { Mi } ; 

subno=subnocell {Mi } ; 
return 
end 
end 
end 
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calcaverages .m 


Description : 

This  function  contains  the  average  motion  vector  values 
for  all  selected  binary  subimages. 

Usage : 

sigmaybar , clbar , c2bar ] = . . . 
calcaverages  (propCluster ,  sizes,  alpha, ... 
theta, sigmax, sigmay, cl, c2) 

Input : 

propCluster, 

sizes  -  The  total  size  of  each  image, 
alpha  -  The  deformation  angle  of  each  pair  of 

images . 

theta  -  The  rotation  angle  of  each  pair  of  images, 

sigmax  -  The  Scaling  factor  in  the  x-axis. 

sigmay  -  The  scaling  factor  in  the  y-axis. 
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o 

o 

cl 

The  shift  in  x- 

■axis  . 

o 

o 

c2 

-  The  shift  in 

the  y-axis 

o 

o 

Output : 

o 

o 

propCluster 

-  The  indices  of  the  nodes  that  contains 

the  best  binary  images  that 

can 

be  used 

to  calculate 

motion  vectors . 

o 

o 

sizes 

-  The  total  sizes  of  these  binary 

images . 

o 

o 

alpha 

-  The  averaged 

deformation  angles 

of  these 

o 

o 

images . 

o 

o 

theta 

-  The  averaged 

rotation  angles 

of 

these 

o 

o 

images . 

o 

o 

sigmax 

-  The  averaged 

scaling  factors 

in 

the  x- 

o 

o 

axes  . 

o 

o 

sigmay 

-  The  averaged 

scaling  factors 

in 

the  y- 

o 

o 

axis 

o 

o 

cl 

-  The  averaged 

shift  in  the  x- 

axis 

o 

o 

c2 

-  The  averaged 

shift  in  the  y- 

axis 

o  o 
o  o 

ooooooooooooo 

9'2'2'2'9'9'9'9'9'9'9'9'9'9'9'S- 

oooooooooooooooc 

IOOOOOOOOOOOOOOO 

o  o  o  o 
o  o  o  o 

ooooooooo 

function  [alphabar,  thetabar,  sigmaxbar, ... 
sigmaybar ,  clbar ,  c2bar ] = . . . 

my fun (propCluster ,  sizes ,  alpha, theta, sigmax, sigmay, cl , c2 ) 

q=[]; 
w=  [  ]  ; 
for  i=l : 6 
for  j  =i  +  l : 6 
for  k= j  +1 : 6 

q=[q;  i/j/k]; 

end 

end 

end 

pc=unique (propCluster  (  : , 1 :  2 ) ) ; 
si=q (pc, : ) ; 
subsizes=sizes  (si) ; 
for  i=l : size ( subsizes , 1 ) 
w (i) =min (subsizes  (i,  :  ) ) ; 
end 
w=w 1  ; 

alphabar=sum (w . * alpha (pc) ) /sum (w) ; 
thetabar=sum (w . * theta (pc) ) / sum (w) ; 
sigmaxbar=sum (w . * sigmax (pc) ) / sum (w)  ; 
sigmaybar =sum (w . * sigmay (pc) ) /sum (w) ; 
clbar=sum (w . *cl (pc) ) /sum (w) ; 
c2bar=sum (w . *c2 (pc) ) /sum (w)  ; 
end 


Evalidation . m 


%  The  validation  of  array  E  -  auxiliary  function  for 
%  GrTheory  Toolbox. 

%  Author:  Sergiy  Iglin 
%  e-mail:  iglin@kpi.kharkov.ua 
%  or:  siglin@yandex.ru 

%  personal  page:  http : / / iglin . exponenta . ru 

oooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


function  [m,n,newE]  =  Evalidation (E) 
if  ~isnumeric (E) , 
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594  error (' The  array  E  must  be  numeric!') 

595  end 

596  se=size (E) ; 

597  if  length (se) -=2 , 

598  error (' The  array  E  must  be  2D!') 

599  end 

600  if  (se(2)<2), 

601  error (' The  array  E  must  have  2  or  3  columns!'), 

602  end 

603  if  -all (all (E ( : , 1:2) >0) ) , 

604  error ('1st  and  2nd  columns  of  the  array  E  must  be 

605  positive ! ' ) 

606  end 

607  if  -all (all ( (E  ( : ,  1 :  2 ) =— round (E  ( : ,  1 :  2 ) ) ) ) ) , 

608  error  ('1st  and  2nd  columns  of  the  array  E  must  be... 

609  integer ! ' ) 

610  end 

611  m=se  ( 1 ) ; 

612  if  se  (2 ) <3, 

613  E(:,3)=l; 

614  end 

615  newE=E  (  : , 1 :  3)  ; 

616  n=max (max (newE ( : , 1 : 2 ) ) ) ; 

617  return 

618  end 
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clusthre check . m 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

%  Description: 

%  This  function  check  the  maximum  threshold  between  all 
%  nodes  and  it  outputs  the  maximum  value. 

%  Usage: 

%  maxdis  =  clustercheck (E, a) 

%  Input: 

%  E  -  The  clusters  to  be  checked. 

%  a  -  The  distances  between  the  nodes  within 

%  clusters. 

%  Output : 

%  maxdis  -  The  maximum  distance  between  the  nodes  in  a 
%  cluster 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

%This  function  check  the  maximum  distance  (threshold) 
%between  all  nodes  in  the  cluster  a 
function  maxdis=myfun (E, a) 
q=unique ( a  (  :  ,  1 :  2  ) ) ; 
w=  [  ]  ; 

for  i=l : size (q, 1) 
for  j=i+l : size (q, 1 ) 
w—  [w;  q (i) ,  q ( j ) ]  ; 
end 
end 

[tf , loc]  =  ismember (w, E ( : , 1 : 2 ) , ' rows ' ) ; 
maxdis=max (E (loc,  3)  )  ; 
end 
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moveitem . m 


%  Description: 

%  This  function  removes  a  node  from  a  cluster  and  puts 
%  it  to  another  cluster. 


o 

o 


%  Usage: 

%  [src, dist ] =moveitem (src, dist , vals) 

%  Input: 

%  src  :  The  source  value. 

%  dist  :  The  distination. 

%  vals  :  Values  of  all  nodes  in  the  clusters. 
%  Output : 

%  src  :  The  source  value  after  removing. 

%  dist  :  The  distination  after  removing. 


ooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


function  [ src , dist] =moveitem ( src , dist , vals) 
[tf , loc]  =  ismember (vals, src ' rows ' )  ; 

src=src ( setdif f (l:size(src,l)  ,  loc)  ,  :)  ; 
dist=[dist;  vals]  ; 
end 
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removerelated . m 


Description : 

Remove  loops  in  clusters . 

Usage : 

[m, templist ] ^remove real ted (m, tempi is t , val) 
Input : 


m 


The  nodes  in  the  cluster. 


templist  -  A  temporary  list  used  for  switching  nodes 


val 

Output : 
m 


The  values  of  these  nodes. 


The  nodes  in  the  cluster  after  removing 
loops 

templist  -  A  temporary  list  used  for  switching  nodes 


function  [m, templist] ^remove real ted (m, templist , val) 
[ r ,  c] =f ind (m  ( : ,  1 : 2 ) ==val ( 1 )  I  m ( : , 1 : 2 ) ==val (2 ) ) ; 
d=m (r ,  :  )  ; 

[m, templist] =moveitem (m, templist, d) ; 
end 
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sortcluster . m 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

%  Description: 

%  This  function  sort  the  clusters  according  to  their 
%  edges. 

%  Usage: 

%  [cl ] ^sortcluster  (c] 

%  Input: 

%  c  -  Unsorted  cluster. 

%  Output : 

%  cl  -  Sorted  cluster. 
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\J  J  O  ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

699 

700  function  [cl]=ext(c) 

701  c 1 = [ ] ; 

7  02  m=  [  ]  ; 

703  h=l ; 

704  i=l ; 

705  while  i<size(c,l) 

706  while  c (i, 1 ) ~=0 

707  i=i+l ; 

708  end 

709  temp=c (h fl-1 ,  : ) ; 

710  temp=sortmat (temp) 

711  m=[m;temp;0  0  0]; 

712  i=i+l ; 

713  h=i ; 

714  end 

715  c 1 =m ; 

716  end 


717 

718 

719 

720 

721 

722 

723 

724 

725 

726 

727 

728 

729 

730 

731 

732 

733 

734 

735 

736 

737 

738 

739 

740 

741 

742 

743 

744 

745 

746 

747 

748 

749 

750 

751 

752 


con4 . m 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

%  Description: 

%  Given  a  reference  image  and  motion  vectors,  this 
%  function  constructs  a  new  image  based  on  the  reference 
%  image  and  motion  vectors . 

%  Usage: 

%  function  [newlmage ]  =con4  ( imagel ,  theta,  alpha,  sx, ... 

%  sy, cl , c2 ) 

%  Input: 

%  imagel  -  The  reference  image. 

%  theta  -  The  rotation  angle. 

%  alpha  -  The  deformation  angle. 

%  sx  -  The  scaling  factor  in  x-axis. 

%  sy  -  The  scaling  factor  in  y-axis. 

%  cl  -  The  shift  in  x-axis. 


%  c2  -  The  shift  in  y-axis. 

%  Output : 

%  newlmage  -  The  new  constructed  image  base  on  the 

%  reference  image  and  the  motion  vectors. 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


function  [newlmage ] =myf un ( imagel , theta, alpha, sx,sy,cl,c2) 
close  all; 

A= [sx*cos (alpha-theta)  sx*sin (alpha-theta) ; 
sy*sin (alpha+theta)  sy*cos (alpha+theta) ] ; 
figure ; 

imshow ( imagel ) ; 
for  x=l : size ( imagel , 2 ) 
for  y=l : size ( imagel , 1 ) 
xpos=x*A ( 1 ) +y*A (2 ) +cl ; 
ypos=x*A ( 3 ) +y*A ( 4 ) +c2 ; 
fx  =  floor (xpos); 
fy  =  floor (ypos); 
apha=xpos-f loor (xpos ) ; 
beta=ypos-f loor (ypos ) ; 

if  ~ ( f x+l>size ( f , 2 )  |  f y+l>size (f , 1 )  |  fx<l  |  fy<l) 
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753  newlmage (y, x)  =  ... 

754  image (fy,  fx)  *  (1  -  apha)*(l  -  beta)+... 

755  image (fy,  fx+1)  *  apha* (1-beta)  +... 

756  image (fy+1,  fx)  *  ( 1-apha) *beta  +... 

757  image (fy+1,  fx+1)  *apha*beta  ; 

758  elseif 

759  (fx>0  &&  fy>0)  &&  ( fx==size ( imagel , 2 )  |  | ... 

760  fy==size (imagel, 1) ) 

761  &&  ~ ( f x>size ( imagel , 2 )  | |  fy>size ( imagel , 1 ) ) 

762  newlmage (y, x)  =  image (fy,  fx) ; 

763  else 

764  newlmage (y, x)  =  0; 

765  end 

766  end 

767  end 

768  figure; 

769  imshow (newlmage, []) 

770  end 
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de5  .  m 


Description : 

This  function  takes  two  images  and  plot  the  phase 
difference  of  FFT  of  them. 

Usage : 

[m, n] =de5 ( iml , im2 ) 

Input : 

iml  -  The  first  image. 

im2  -  The  second  image. 

Output : 

m  -  The  u  values  of  the  phase  difference, 

n  -  The  v  values  of  the  phase  difference. 


function  [m, n] =myfun ( iml , im2 ) ; 
close  all 
Fl=fft2 (iml) ; 

F2=fft2 (im2) ; 

Fl=f f tshif t (FI)  ; 

F2=f f tshif t (F2)  ; 

P=F1 . / F2 ; 

Pph=angle  (P) ; 
figure ; 
imshow ( Pph) ; 
s=size ( iml , 1 ) ; ; 
z=s/2; 

Psurf =Pph . * ( s / (2  *pi ) ) ; 

PP=Psurf  (  z ,  : )  ; 

PP=PP’  ; 

figure;  plot(PP); 

QQ=Psurf ( : , z ) ; 
figure;  plot (QQ) ; 
j  P=z ; 

while  PP ( jp+1 ) -PP ( jp) <max (PP) *0 . 3  &&  jp<s 
jp=jp+l; 
end 
ip=z ; 
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808  while  PP (ip) -PP (ip-1) <max (PP) *0 . 3  &&  ip>2 

809  ip=ip-l; 

810  end 

811  jp=jp-i; 

812  P1=PP (ip : jp) ; 

813  m=PP; 

814  p=polyfit ( [ip: jp] ' , PI, 1) 

815  f  =  polyval (p, [ ip : jp] ) ; 

816  figure;  plot ( [ip : jp] ,  PI ,  [ip : jp] ,  f ,  ' -r 1 ) ; 

817  j  q= z ; 

818  while  QQ ( j q+1 ) -QQ ( j q) <max (QQ) *0 . 25  &&  jq<s 

819  jq=jq+i; 

820  end 

821  iq=z; 

822  while  QQ (iq) -QQ (iq-1) <max (QQ) *0 . 25  &&  jq>2 

823  iq=iq-l; 

824  end 

82  5  j  q;=  j  q;—  1 ; 

826  Q1=QQ (iq:  jq)  ; 

827  n=QQ; 

828  p=polyf it ( [iq: jq] 1 , Ql, 1) 

829  f  =  polyval (p, [iq: jq] ) ; 

830  figure;  plot ( [ iq : j q] , Q1 ,  [ iq : j q] ,  f ,  ' -r 1 ) ; 

831  end 
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convld . m 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

%  Description: 

%  This  function  takes  two  signals  and  plots  the 
%  convolution  of  them. 

%  Usage: 

%  convld (mask, q) 

%  Input: 

%  mask  -  The  first  signal  (which  is  the  digital  filter) 

%  q  -  The  signal  intended  to  be  convoluted. 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

function  myf un (mask, q) 
hl=mask(l,2:7) ; 
ds= [ 1  24]; 
suml=0 ; 
zl=[]  ;z2=[]  ; 

co=  1  r  ’  ;  '  g  '  ;  fkf  ;fbf  ;  ’  c  ’  ;  ’m’]; 

cp=l  ; 

for  (s=l : 3) 

for  (n=6^ds (s) : (336-ds (s) *6) ) 
for  (k=2 : 6) 

suml=suml+ (q(n+ (k-1) ^ds (s) ) +q(n- (k-1) ^ds (s) ) ) *hl (k) ; 
end 

zl (n) =hl (1) *q (n) +suml ; 
suml=0 ; 
end 

h2=mask (2,2:9) ; 
suml=0 ; 

for  (n=8*ds (s) : (336-ds (s) *8) ) 
for  (k=2 : 6) 

suml=suml+ (q(n+ (k-1) ^ds (s) ) +q(n- (k-1) *ds (s) ) ) ^h2 (k) ; 
end 
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864  z2 (n)=h2 (1) *q(n)+suml; 

865  suml=0; 

866  end 

867  plot ( zl , co (cp) ) ; 

868  hold  on; 

869  cp=cp+l; 

870  plot ( z2 , co (cp) ) ; 

871  cp=cp+l; 

872  end 

873  hold  on; 

874  plot ( [0  400] ,  [ 0  0] , 1 k' ) 

875  end 
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fullrangephi .m 


oooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

%  Description: 

%  This  function  takes  the  values  of  rotation  angles  and 
%  out  them  as  full  range. 

%  Usage: 

%  [q] =f ullrangephi (phi, FramesNumber) 

%  Input: 

%  phi  -  The  rotation  angel. 

%  FramesNumber  -  The  number  of  the  frame  which  the  phi 

%  belong  to. 

%  Output : 

%  q  -  A  full-ranged  alpha. 


function  [q] =myfun (phi, FramesNumber) 

q=[]; 

q ( 1 ) —phi ( 1 ) ; 
cdif=0; 

for  ( i=2 : FramesNumber ) 
dif=phi (i) -phi ( 1 — 1 ) ; 
if  dif >2 

cdif =cdif-2 *pi ; 
end 

if  dif <-2 
cdif =cdif +2  *pi ; 
end 

q (i) =phi (i) +cdif ; 
end; 
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j  d .  m 

oooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

%  Description: 

%  This  function  takes  plots  the  zero-crossing  values  that 
%  yield  from  convolution  of  the  rotation  angle  and  the 
%  mask. 

%  Usage: 

%  j d (moments , mask) 

%  Input: 

%  moments  -  The  moments  of  the  processed  frames. 

mask  -  The  digital  filter  used  in  the  convolution 

oooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


function  j d (moments , mask) 
phi=atan2 (moments ( : , 2 ) , moments ( : , 3 ) ) ; 
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918  q=f ull_range_phi (phi, size (moments, 1) ) ; 

919  plot (q) ; 

920  convld (mask, q) 

921  end 


get_moments .m 
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%  Description: 

%  This  function  takes  an  image  and  calculates  the  moments 
%  of  this  image. 

%  Usage: 

%  [weight, S00 , MOO , M01 , M10 , Mil , M20 , M0 2 ] =get_moments (img) 


o 

o 

Input : 

o 

o 

img 

-  The 

input  image . 

o 

o 

Output : 

o 

o 

weight 

-  The 

weight  used  to  calculate  the  moments 

o 

o 

MOO 

-  The 

MOO  moment. 

o 

o 

M01 

-  The 

M01  moment. 

o 

o 

M10 

-  The 

M10  moment. 

o 

o 

Mil 

-  The 

Mil  moment. 

o 

o 

M20 

-  The 

M20  moment. 

o 

o 

M02 

-  The 

M0 2  moment. 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


function  [weight, S00 , MOO , M01 , M10 , Mil , M20 , M0 2 ] =myfun (img) 
ii=32  0 ; 
j j=240; 

w_sigma_len=105 ; 
for  ( i=l : ii ) 
for  ( j-1 : j  j ) 

u ( j  ,  i)  =  ( (i-ii/2) A2+(j-jj/2) A2) / (2 *w_sigma_lenA2 ) ; 
end; 
end; 

weight=255*exp (-l*u . A2 ) ; 

weight=u; 

im=double (im) ; 

S  0  0  =  0 ; 

S 1 0  =  0 ; 

S  0 1  =  0  ; 

SI 1=0 ; 

S  2  0  =  0 ; 

S  0  2  =  0  ; 

for  (x=l : ii ) 
for (y=l : j j ) 
cx=x-ii/ 2 ; 
cy=y-j j/2; 

S00=S00+im(y,x) * weight (y, x) ; 

S10=S10+cx  *  im(y,x)  *  weight (y,x); 

S01=S01+cy  *  im(y,x)  *  weight (y,x); 

SI 1=S1 l+cx*cy*im (y, x) * weight (y,  x)  ; 

S2 0=S2 0+cx*cx  *im (y, x) * weight (y, x) ; 

S02=S02+cy^cy  *im (y, x) ^weight (y, x) ; 
end; 
end; 

M00=S00 ; 

Ml 0=S1 0 ; 
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973 

M01=S01; 

974 

M11=S11; 

975 

M20=S20 ; 

976 

M02=S02 ; 

977 

end 
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Createmovie . m 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

%  Description: 

%  This  function  create  a  movie  out  of  a  set  of  images. 

%  Usage: 

%  [ avif ilename ] =Createmovie (A, f rames_location) 

%  Input: 

%  A  -  Set  of  images. 

%  f rames_location  -  The  location  of  the  images. 

%  Output : 

%  avif ilename  -  The  file  name  of  the  movie. 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

function  [ avif ilename ] =myfunc (A, f rames_location) 
aviobj  =  avif ile (avif ilename,  ’ fps  ',  28 ) ; 
icon= [ ] ; 
num=size (A, 2 ) ; 
for  ( j  =1 : num) 
i=A  ( j  )  ; 
if  i<10 

image  l=s  treat  ( frame  s_lo  cat  ion , ... 

1  frame 00 ' , int2str (i) ,  '  .bmp ' ) 
elseif  i<100 

image  l=s  treat  ( frame  s_lo  cat  ion,  1  f  rameO  1  , ... 
int2str (i)  ,  ' .bmp ' ) 
else 

image  l=s  treat  ( frame  s_lo  cat  ion,  '  frame  1  , ... 
int2str (i)  ,  ' .bmp ' ) 
end 

if  ~ismember ( i , [115:126,230:239]) 
im=imread ( imagel ) ; 
imshow ( im) ; 
frame=get frame ; 

aviobj  =  addframe (aviobj , frame) ; 
end 
end 

aviobj  =  close (aviobj ) ; 
end 


get_angle_dif . m 

1  U  1  J  ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 

1016  %  Description: 

1017  %  This  function  calculated  the  rotation  angles  difference 

1018  between  two  images. 

1019  %  Usage: 

1020  %  [ang_dif ] =get_angle_dif (iml, im2) 

1021  %  Input: 

1022  %  iml  -  The  first  image. 

1023  %  im2  -  The  second  image. 

1024  %  Output: 
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ang_dif  -  The  difference  in  rotation  angles  between  the 
two  images . 


function  [ang_dif ] =myfunc (iml, im2) 
[M00_1,M01_1,M10_1,M11_1,M20_1,M02_1] =get_moments (iml) ; 
[MO 0_2  , M0 1_2 , Ml 0_2 , Ml 1_2 , M2 0_2  , M02_2 ] =get_moments (im2)  ; 
Fl  = [M2 0_1  Ml 1_1 ;  Mll_l  M02_l]; 

F2  = [M2 0_2  Ml 1_2 ;  Mll_2  M02_2]; 

[VI, Dl]  =  eig(Fl) ; 

[V2,D2]  =  eig (F2 ) ; 
angl=atan2 (VI (2,1) , VI (1, 1) ) ; 
ang2=atan2 (V2(2,1),V2(1,1)); 
ang_dif =ang2-angl ; 
end 
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vsplitter2 . m 


%  Description: 

%  This  function  split  a  movie  into  images. 

%  Usage: 

%  [ iml ] =vsplitter2 (movief ilename, destinationf ilename) 

%  Input: 

%  movief ilename  -  The  movie's  file  name. 

%  destinationf ilename  -  The  name  of  the  extracted 

%  images . 

%  Output : 

%  iml  -  The  image  file  name. 

ooooooooooooooooooooooooooooooooooooooooooooooooooooooooooo 


function  [ iml ] =myfun (movief ilename, destinationf ilename) 
mov=aviread (movief ilename) ; 
for  i=l : size (mov, 2 ) 
iml=mov(i) . cdata; 
if  i<10 

filename's treat (destinationf ilename, int2str ( i ) , 1 .bmp ' ) 
elseif  i<100 

filename's treat (destinationf ilename, int2str (i)  ,  '  .bmp  1 ) 
elseif  i>=100 

filename's treat (destinationf ilename, int2str (i) ,  1  .bmp  1 ) 
end 

imwrite (iml, filename) ; 
end 
end 
end 
end 
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